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The  power  spectrum  is  fundamental  to  signal  analysis  and  understanding  and  has 
found  many  applications  in  communications,  speech,  and  radar.  One  of  the  disadvantages 
of  using  the  power  spectrum  is  that  the  estimate  of  the  power  spectrum  is  inconsistent, 
meaning  that  the  variance  of  the  estimate  does  not  approach  zero  even  when  the  data  length 
grows  infinitely  large. 

Lim  and  Harris  have  developed  a simple  analog  IC  for  Ratio  Spectrum  (RS)  com- 
putation. The  RS  is  formed  by  computing  the  ratio  of  the  power  of  a low-pass  filtered 
signal  to  the  power  of  the  original  un-filtered  signal  for  all  possible  filter  cutoff  frequen- 
cies. The  RS  has  many  advantages  over  the  traditional  power  spectrum,  particularly  in 
terms  of  improved  feature  extraction  and  efficient  implementation  in  analog  circuitry. 

We  first  study  the  important  result  that  the  integrated  spectrum  and  the  ratio  spec- 
trum are  consistent  estimators  unlike  the  widely  used  power  spectrum.  This  property  makes 
the  integrated  spectrum  and  the  ratio  spectrum  more  useful  for  some  applications  than  the 
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conventional  power  spectrum.  In  addition,  we  derive  the  mean  and  variance  of  the  ratio 
spectrum  and  obtain  an  approximate  closed-form  solution.  As  a result,  the  ratio  spectrum 
estimate  is  asymptotically  consistent,  which  is  verified  by  the  simulations. 

We  next  propose  a detection  algorithm  for  unknown  signals  using  the  RS.  In  the  al- 
gorithm, we  sample  the  ratio  axis  to  enhance  the  signal  to  noise  ratio  such  that  the  detection 
performance  can  be  improved.  We  also  analyze  the  detection  performance.  Furthermore, 
we  show  that  the  performance  of  the  algorithm  is  better  than  the  conventional  GLRT  for 
bandlimited  signals  with  fewer  assumptions  made.  In  addition,  we  propose  a RS  domain 
based  algorithm  for  parameter  estimation.  Using  the  consistency  property  of  the  RS,  we 
show  that  improved  performance  can  be  achieved. 

An  improved  analog  IC  for  the  analog  implementation  of  the  RS  has  been  designed, 
implemented  and  tested.  In  the  chip,  a tenth-order  Butterworth  gm-C  LPF  is  designed,  and 
the  cutoff  frequency  of  the  LPF  can  be  widely  adjusted  from  100Hz  to  5KHz.  We  describe 
the  circuits  and  discuss  their  design  considerations,  such  as  the  cascade  sequence  issue  for 
the  filter.  In  the  implementation  of  the  RS,  we  analyze  the  convergence  performance  and 
steady  state  error,  which  are  important  issues  for  any  adaptive  system.  Using  piecewise 
linear  techniques,  we  derive  equations  for  computing  the  convergence  time,  which  are 
verified  by  simulations  and  chip  measurements.  The  power  computation  circuit  consists  of 
a square  and  a LPF.  We  also  discuss  the  selection  of  the  cutoff  frequency  in  the  LPF.  One  of 
the  advantages  of  using  this  design  is  that  the  short  time  window  needed  for  non-stationary 
signal  can  be  achieved  by  the  design. 

Finally,  we  conduct  the  chip  test  for  both  assumed  signals  and  real  speech  signals. 
In  the  chip  test,  1%  deviation  is  achieved  for  a pure  sinusoid  signal  input,  and  less  than  7% 
deviation  is  achieved  for  the  real  speech  signals.  As  a result,  the  main  features  can  be  rec- 
ognized by  using  the  RS  obtained  from  the  chip.  We  believe  that  the  analog  ratio  spectrum 
offers  an  alternative  to  conventional  filter  banks  for  signal  processing  applications. 
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CHAPTER  1 
INTRODUCTION 


The  power  spectrum  is  fundamental  to  signal  analysis  and  understanding  [l]-[3] 
and  has  found  many  applications  in  communications,  speech  and  radar.  One  of  the  disad- 
vantages of  using  the  power  spectrum  is  that  the  estimate  of  the  power  spectrum  is  incon- 
sistent. As  a result,  the  variance  of  the  estimate  does  not  approach  zero  even  when  the  data 
length  grows  infinitely  large.  To  obtain  a consistent  estimate,  the  integrated  spectrum  with 
and  without  normalization  has  been  discussed  in  [4]-[6], 

Lim  and  Harris  have  developed  a simple  analog  IC  for  Ratio  spectrum  (RS)  compu- 
tation [7]-[8],  The  RS  is  formed  by  computing  the  ratio  of  the  power  of  a low-pass  filtered 
signal  to  the  power  of  the  original  un-filtered  signal  for  all  possible  filter  cutoff  frequen- 
cies. The  RS  has  many  advantages  over  the  traditional  power  spectrum,  particularly  in 
terms  of  improved  feature  extraction  and  efficient  implementation  in  analog  circuitry.  The 
RS  has  been  shown  to  be  competitive  with  the  best  existing  speech  recognition  techniques 
for  phoneme  recognition  [7] -[9].  It  has  also  been  used  successfully  in  speech  and  audio 
coding  applications  [10]-[11],  Compared  with  relevant  analog  cochlear  implementation 
[12]-  [18]  in  speech  processing,  the  RS  efficiently  combines  spectral  analysis  with  feature 
extraction  to  provide  a simple,  powerful  front-end  processing  tool.  Both  analog  and  digital 
versions  of  the  RS  have  been  defined  and  implemented. 

However,  many  fundamental  questions  were  left  unanswer  from  the  previous  work. 
One  obvious  question  is  what  are  the  fundamental  reasons  behind  the  good  performance 
in  the  above  applications.  Further  questions  involve  the  convergence  rate  and  steady  state 
error  of  the  ratio  spectrum.  If  we  answer  these  questions,  we  can  develop  many  other 
applications,  and  develop  improved  hardware  for  application  for  real  speech.  Motivated  by 


1 


2 


the  above  goals,  this  dissertation  conducts  research  both  in  signal  processing  and  analog 
IC  design. 

Chapter  2 of  this  dissertation  introduces  the  power  spectrum,  the  integrated  spec- 
trum and  the  ratio  spectrum.  We  derive  their  distributions  (mean  and  variance).  Also, 
we  derive  the  approximate  quantitative  relationship  between  uniform  sampling  on  the  fre- 
quency axis  and  uniform  sampling  on  the  ratio  axis.  Sampling  along  the  ratio  axis  of  the 
RS  generates  nonuniform  sampling  along  the  frequency  axis. 

Chapter  3 describes  new  applications  in  detection  and  parameter  estimation.  First, 
we  propose  the  RS  algorithm  for  the  problem  of  detection  of  arbitrary  signals  in  noise.  We 
show  that  the  RS-based  detection  scheme  outperforms  the  standard  generalized  likelihood 
test  (GLRT)  for  bandlimited  signals.  We  also  discuss  a novel  wavelet-based  detection  tech- 
nique. Numerical  simulations  verify  the  performance  of  the  RS-based  detection  algorithm. 
Second,  motivated  by  the  consistent  estimate  of  the  RS,  we  propose  a RS-based  algorithm 
for  parameter  estimation.  We  also  discuss  its  performance. 

Chapter  4 will  present  the  short-time  analog  RS  implementation.  In  many  appli- 
cations, such  as  speech  processing,  the  signals  are  non- stationary  but  are  approximated  as 
locally  stationary  within  small  windows.  The  selection  of  the  type  and  size  of  the  window 
function  is  crucial  and  dependent  on  the  particular  application.  Then,  we  discuss  the  effect 
when  the  low-pass  filter  is  not  ideal,  but  a tenth-order  Butterworth  LPF.  In  this  case,  the 
window  is  a constant  Q filter,  which  can  be  modelled  as  a Gaussian  function  in  frequency 
domain.  In  addition,  we  derive  the  expression  for  the  computation  of  the  steady  state  error, 
and  use  piece-wise  linear  techniques  to  obtain  an  expression  for  the  convergence  time  of 
the  ratio  spectrum  for  both  its  continuous-  and  discrete-time  forms.  We  verify  the  results 
through  numerical  simulations. 

Chapter  5 details  the  analog  hardware  implementation  of  the  RS  circuit.  We  inte- 
grate a tenth-order  Butterworth  low-pass  filter,  together  with  circuitry  for  computing  the 
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signal  power  at  the  input  and  the  output  of  the  filter.  A control  system  (also  in  analog  hard- 
ware) then  modifies  the  cutoff  frequency  of  the  LPF  in  such  a way  as  to  match  the  correct 
power  ratio.  SPICE  simulations  verify  the  performance  of  the  circuits.  Also,  the  chip  test 
results  for  both  assumed  signals  and  real  speech  signals  are  reported. 

Finally,  Chapter  6 summarizes  the  current  results,  contributions  and  future  work 
for  this  dissertation. 


CHAPTER  2 

SPECTRUM,  INTEGRATED  SPECTRUM,  RATIO  SPECTRUM  AND  THEIR  PROPERTIES 

2.1  Introduction 

In  this  chapter,  we  first  discuss  the  distribution  of  the  widely-used  power  spec- 
trum, followed  by  the  distributions  of  the  integrated  spectrum  and  the  ratio  spectrum.  We 
prove  an  important  result  that  the  integrated  spectrum  and  the  ratio  spectrum  are  consistent 
estimators  unlike  the  widely-used  power  spectrum  (periodogram).  This  property  makes 
the  integrated  spectrum  and  the  ratio  spectrum  more  useful  than  the  power  spectrum  for 
many  applications.  It  should  be  pointed  out  that  the  derivations  are  different  from  previous 
work  in  this  area  [4],  although  the  ideas  and  results  are  similar.  In  addition,  we  derive  the 
variance  of  ratio  spectrum  and  obtain  the  approximate  closed-form  solution  and  its  dis- 
tribution, which  is  considered  as  difficult  in  [4],  Finally,  each  conclusion  is  verified  by 
computer  simulations. 

After  discussing  the  spectrum,  the  integrated  spectrum,  the  ratio  spectrum  and  their 
distributions,  we  will  introduce  a powerful  property  of  the  RS,  uniformly  sampling  along 
the  RS  axis  rather  than  uniformly  sampling  along  the  frequency  axis.  The  RS  axis  sampling 
tends  to  place  more  samples  in  regions  of  high  power  and  thus  increase  the  signal  to  noise 
ratio  (SNR).  This  increase  in  SNR  directly  leads  to  an  increase  in  detection  probability  as 
is  discussed  in  chapter  3. 

2.2  Power  Spectrum 

There  are  two  definitions  of  the  power  spectrum  density  estimation,  and  each  corre- 
sponds to  a different  computation  method.  They  are  the  periodogram  and  the  correlogram. 
The  periodogram  is  defined  as 
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where  Pp(v)  is  the  periodogram,  x(i)  is  the  input  signal  and  —n  < u < n.  The  correlo- 


PM  = (2.2) 


(2-2) 


*=— (JV— 1) 


where  Pc(u)  is  the  correlogram,  f(i)  is  the  covariance  function  estimation  of  the  input 


signal. 


Those  two  estimations  of  PSD  can  be  equivalent,  discussed  in  [19],  However,  in 
practical  applications,  the  periodogram  is  often  used  due  to  the  use  of  the  efficient  FFT.  If 
the  DFT  is  used,  the  power  spectrum  can  be  written  as 


length  of  the  data.  In  our  discussion  of  PSD  estimation,  the  spectrum  means  the  peri- 
odogram unless  otherwise  specified.  In  the  estimation,  the  mean  and  variance  of  the  esti- 
mation are  two  major  criteria.  We  wish  that  the  mean  of  the  estimate  is  equal  to  the  true 
value,  and  the  variance  of  the  estimate  is  as  small  as  possible. 

We  consider  first  the  case  when  x(i)  is  Gaussian  white  noise  with  mean  0 and 
variance  a2,  i.e.,  x(i)  ~ iV(0,  a2),  i = 0, 1,  • • • , N.  Then,  the  spectrum  is 


Px(k)  = jf  0 < k < N — 1 (2.3) 

where  x{i)  and  Px(k)  are  the  input  signal  and  its  periodogram  respectively,  and  N is  the 


(2-3) 


m 


N-l 


N- 1 


A2(k)  + B2(k) 


(2-4) 


where 


(2-5) 


and 


(2-6) 
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It  now  follows  that  A(k),  being  a linear  combination  of  the  x{i),  is  also  normally 
distributed  with  zero  mean  and  variance  given  by 


In  particular,  using  the  result  that  the  mean  and  variance  of  a \n  distribution  [20]  are  n,  2 n, 
respectively,  we  have 


From  the  above  derivation,  the  mean  of  the  estimation  is  equal  to  the  true  value, 
so  the  estimation  is  unbiased.  Also,  the  variance  is  independent  of  the  data  length  N,  so 
the  estimate  is  inconsistent.  To  verify  the  results  given  in  Eq.  2.11  and  Eq.  2.12,  Fig.  2.1 
shows  the  waveform,  power  spectrum,  the  mean  and  variance  of  the  spectrum  for  white 
noise  with  N = 512  and  a2  = 2.  Fig.  2.1(a)  shows  the  noise  waveform  in  the  time  domain, 
while  Fig.  2.1(b)  shows  its  power  spectrum.  We  can  see  that  the  spectrum  is  a random 
waveform  although  its  mean  is  constant  2.  Fig.  2.1(c)(d)  show  the  mean  and  variance, 
which  are  obtained  by  averaging  1000  trials.  The  mean  is  close  to  the  analytic  value  2, 
while  the  variance  is  close  to  4.  In  the  next  simulation,  the  conditions  are  the  same  as 


k ± 0,  Nj 2 
k = 0,  N/2 


(2.7) 


Thus, 


k ± 0,  N/2 
k = 0,  N/2 


(2.8) 


A similar  argument  shows 


k ^ 0,  N/2  - 
k = 0,  N/2 


(2.9) 


As  a result,  Px(k)  is  independently  distributed,  and  for  each  k, 


k ^ 0,  N/2 
k = 0,  N/2 


(2.10) 


E[Px(k))  = a 2 


(2.11) 


and 


k ± 0,  N/2 
k = 0,  N/2 


(2.12) 


Magnitude 
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The  waveform  of  white  noise  The  power  spectrum 


Frequency(HZ)(o2=2,  N=512)  FrequencylHZXo2^,  N=512) 

(c)  (d) 

Figure  2. 1 : The  power  spectrum  of  white  noise  (N  = 512,  a2  = 2).  (a)  Waveform  of  white 
noise  in  the  time  domain;  (b)  Power  spectrum;  (c)  The  mean  of  the  power  spectrum;  (d) 
The  variance  of  the  power  spectrum. 
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The  waveform  of  white  noise  The  power  spectrum 


TIME  {<^=2.  N=1024)  Frequency(HZ)(o2=2,  N=1024) 


(a) 

The  mean  of  the  power  spectrum 


simulated 

2.2  - - - analytical 


0 0.1  0.2  0.3  0.4  0.5 

Frequency(HZ)(o2=2,  N=1024) 


(b) 


The  variance  of  power  spectrum 


simulated 

7 - analytical 

6.5  - 


(c) 


(d) 


Figure  2.2:  The  power  spectrum  of  white  noise  ( N — 1024,  a2  = 2).  (a)  Waveform 
of  white  noise  in  the  time  domain;  (b)  The  Power  spectrum;  (c)  The  mean  of  the  power 
spectrum;  (d)  The  variance  of  the  power  spectrum. 
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above  except  N = 1024,  and  the  results  are  shown  in  Fig.  2.2.  We  can  observe  that  the 
variance  of  the  spectrum  is  still  close  to  4 and  therefore  is  unchanged  even  though  the  data 
length  is  doubled.  From  these  figures,  we  can  see  that  the  simulation  results  agree  with  the 
analytical  results  given  by  Eq.  2.1 1 and  Eq.  2.12. 

Next,  we  consider  the  case  of  a much  more  general  signal  obtained  by  linearly 
filtering  noise  e(i)  (i.e.  the  general  ARMA  model): 

OO 

y(k)  = ^2hke(k  - i)  (2.13) 

i=  1 

Then,  for  N >>  1,  (the  proof  can  be  found  in  page  424  of  [4],  or  page  21  of  [19]) 

Py{u)  = \H(P“)\2Pe(u)  + 0(\/N'/2)  (2.14) 


where  Py(u)  denotes  the  power  spectrum  estimate  of  the  output  y;  Pe(oj)  the  power  spec- 
trum  estimate  of  noise  e(i);  0(1/NP2)  = (c:  a finite  coefficient)  and 


H(en  = J2h*e~J‘ 


-jujk 


(2.15) 


k= i 


As  such,  if  the  e(i)  are  normal,  asymptotically,  the  set  of  random  variables  Py(u>k), 
k = 0, 1, ...,  N/2,  ( iok  = 2nk/N),  are  independently  distributed,  and  we  have 


Py  ) 


\PyMxl  N/2 


PyMxi  k = 0,  N/2 

In  particular,  the  above  gives  (for  k ^ 0,  N/2) 


(2.16) 


E[PyM\  = PyM 


(2.17) 


and 

var[Py(ujk)\  = Py(uk)  (2.18) 

where  Py(u>)  denotes  the  mean  of  the  power  spectrum.  The  last  result  shows  var[Py(ujk)\  ^ 
0 (as  N — > oo).  As  a result,  the  periodogram  is  not  a consistent  estimator  of  the  spectrum. 
The  restriction  that  uuk  = 2-rrk/N  is  not  important  as  far  as  the  mean  and  variance  of  Py(uj) 
are  concerned;  in  fact,  the  above  result  holds  for  general  lu. 
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The  waveform  of  AR  noise  The  power  spectrum 


The  mean  of  the  power  spectrum  The  variance  of  power  spectrum 


(c) 


(d) 


Figure  2.3:  The  power  spectrum  of  AR  noise  (N  = 512,  a2  = 2,  a = 0.6).  (a)  Waveform 
of  AR  noise  in  time  domain;  (b)  The  power  spectrum;  (c)  The  mean  of  the  power  spectrum; 
(d)  The  variance  of  the  power  spectrum. 


Magnitude  Magnitude 
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The  waveform  of  AR  noise 

~\  i 1 r 


-6 1 1 1 1 1 1 

0 200  400  600  800  1000  1200 

TIME  (ct^=2,  N=1024) 

(a) 


The  mean  of  the  power  spectrum 


Frequency(HZ)(o^=2,  N=1024) 

(b) 


The  variance  of  power  spectrum 


(c) 


(d) 


Figure  2.4:  The  power  spectrum  of  AR  noise  (N  = 1024,  a2  - 2,  a = 0.6).  (a)  Waveform 
of  AR  noise  in  time  domain;  (b)  The  power  spectrum;  (c)  The  mean  of  the  power  spectrum; 
(d)  The  variance  of  the  power  spectrum. 


12 


To  verify  the  results  given  in  Eq.  2.17  and  Eq.  2.18,  Fig.  2.3  shows  the  wave- 
form, power  spectrum,  the  mean  and  variance  of  the  spectrum  respectively  for  AR(1)  noise 
(y(n)  = ay(n  — 1)  + e(n))  with  = 2,  N = 512,  a = 0.6.  Fig.  2.3(a)  shows  its  wave- 
form in  the  time  domain,  while  Fig.  2.3(b)  shows  its  power  spectrum.  From  the  figure, 
the  spectrum  is  a noisy  waveform,  oscillating  around  its  mean.  As  before,  the  mean  and 
variance  are  obtained  by  averaging  1000  trials,  and  are  shown  in  Fig.  2.3(c)(d).  The  mean 
is  close  to  the  analytical  value,  while  the  variance  is  proportional  to  the  square  of  its  mean. 
In  addition,  Fig.  2.4  shows  the  case  corresponding  to  N = 1024.  We  see  the  variance  is 
not  changed  although  the  data  length  is  doubled.  From  those  figures,  we  can  observe  that 
the  simulation  results  agree  with  the  analytical  results  shown  in  Eq.  2.17  and  Eq.  2.18. 

From  these  two  simulations  for  both  white  noise  and  AR  noise,  we  observe  that  the 
estimate  of  the  spectrum  is  unbiased,  but  the  variance  of  the  spectrum  is  approximately 
the  square  of  spectrum  magnitude  independent  of  N.  In  other  words,  the  variance  of 
the  estimate  is  not  reduced  by  increasing  the  data  length  N [21],  Therefore,  the  power 
spectrum  estimate  is  a bad  or  flawed  estimate  since  it  is  not  consistent.  To  overcome  the 
weakness,  the  windowed  spectrum  is  often  used.  However,  the  selections  of  the  window 
type  and  window  length  is  not  simple  [3],  One  of  other  solutions  is  the  integrated  spectrum, 
which  will  be  discussed  in  the  next  section. 


After  discussing  the  power  spectrum,  let  us  look  at  the  integrated  spectrum  [4],  [22] 
defined  as 


mate  respectively,  and  0 < u < -it. 

Now  we  derive  the  expression  of  the  mean  and  variance  of  the  integrated  spectrum, 


2,3  Integrated  Spectrum 


where  Iy(u)  and  Py( uj)  are  the  integrated  spectrum  estimate  and  the  power  spectrum  esti- 
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r 

Jo 


'UJ 


Py(x)dx 


(2.20) 


and 


var[Iy(u)\  = var[  / Py{x)dx\ 


o 

i 


i= 1 

l 


(2.21) 


where  l = [Nlu/tc],  an  integer  close  to  Nu/n. 

From  the  above  derivation,  we  can  see  that  the  mean  of  the  integrated  spectrum 
estimate  is  equal  to  the  true  value,  and  the  variance  is  asymptotically  close  to  zero  as  N 
approaches  infinity.  Thus  the  estimate  of  the  integrated  spectrum  is  a consistent  estimate. 
This  property  makes  it  useful  in  many  applications  such  as  system  identification  [4]  and 
parameter  estimation,  which  will  be  discussed  in  the  next  chapter.  In  addition,  the  distri- 
bution can  be  viewed  as  a Gaussian  asymptotically  [4], 

After  deriving  the  analytical  results  of  the  integrated  spectrum,  we  will  verify  these 
results  by  using  two  simulation  examples.  In  the  first  simulation,  Fig.  2.5  shows  the 
waveform,  the  integrated  power  spectrum,  the  mean  and  the  variance  for  white  noise  with 
cr2  = 2,  N = 512.  Fig.  2.5(b)  shows  the  integrated  spectrum,  which  is  not  as  random  as  the 
power  spectrum.  Fig.  2.5(c)(d)  show  the  mean  estimate  and  variance  estimate  respectively, 
which  are  obtained  by  averaging  1000  trials.  The  mean  estimate  is  shown  in  Fig.  2.5(c), 
which  is  very  close  to  the  analytical  mean.  Unlike  the  large  variance  of  the  power  spectrum 
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estimate,  the  variance  of  the  integrated  spectrum  estimate  is  very  small  with  a maximum 
around  2 x 10-3,  while  the  variance  of  the  power  spectrum  is  4.  In  the  second  simulation 
where  the  same  conditions  with  first  simulation  is  used  except  N = 1024,  the  results  are 
shown  in  Fig.  2.6.  As  expected,  the  variance  of  the  integrated  spectrum  is  smaller  than 
that  of  the  first  simulation,  and  is  reduced  to  half  since  double  the  data  length  is  used.  In 
contrast,  the  variance  of  the  conventional  power  spectrum  is  independent  of  N. 

Furthermore,  Fig.  2.7  and  Fig.  2.8  show  the  cases  of  AR  noise  (y(ri)  = ay(n  — 
1)  + e(n),  cTg  = 2,  a = 0.6)  with  N = 512  and  N = 1024  respectively.  For  the  white  and 
AR  noise,  we  observe  the  estimate  of  integrated  spectrum  is  unbiased,  and  the  variance  of 
the  integrated  spectrum  is  inversely  proportional  to  N,  asymptotically  zero.  Therefore  the 
estimate  of  integrated  spectrum  is  a consistent  estimate.  These  simulation  results  verify 
the  analytical  results  given  by  Eq.  2.20  and  Eq.  2.21. 


2,4  Ratio  Spectrum 


We  now  discuss  the  ratio  spectrum.  In  many  applications,  the  ratio  spectrum  is 
found  more  useful  than  the  integrated  spectrum  due  to  its  normalization.  For  example,  a 
constant  scaling  of  the  integrated  spectrum  can  be  ignored  since  the  numerator  and  de- 
nominator would  both  be  scaled  by  the  same  value.  The  ideal  ratio  spectrum  or  normalized 
integrated  spectrum  is  defined  as 


F(u) 


= Ip  Py(x)dx 

fo  Py{x)dx 


and  its  discrete  form  is 


F{uip)  = 


Ell  Py(Wq) 


EL  Py K) 


(2.22) 


(2.23) 


where  Py(x)  is  the  spectrum  of  y(t)  given  in  Eq.  2.13,  and  the  numerator  is  the  integrated 
spectrum  and  the  denominator  is  the  total  power  over  the  entire  frequency  range.  Also,  we 
here  consider  the  case  uq  = 2 n/Nq  for  simplicity. 
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The  waveform  of  white  noise 
“I 1 1 T 


-5 1 1 1 1 1 

0 100  200  300  400  500  600 

TIME  (<t2=2,  N=512) 


The  integrated  spectrum  (one  trial) 


(a) 

The  mean  of  the  integrated  spectrum 


(b) 


(c) 


(d) 


Figure  2.5:  The  integrated  power  spectrum  of  white  noise  (JV  = 512,  a2  = 2).  (a)  Wave- 
form of  white  noise  in  the  time  domain;  (b)  The  integrated  power  spectrum;  (c)  The  mean 
of  the  power  spectrum;  (d)  The  variance  of  the  power  spectrum. 
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(a) 

The  mean  of  the  integrated  spectrum 


(b) 


(c) 


(d) 


Figure  2.6:  The  integrated  power  spectrum  of  white  noise  (N  = 1024,  a2  = 2).  (a) 
Waveform  of  white  noise  in  the  time  domain;  (b)  The  power  spectrum;  (c)  The  mean  of  the 
power  spectrum;  (d)  The  variance  of  the  power  spectrum. 


Magnitude  ^ Magnitude 


6 


The  waveform  of  AR  noise 


The  integrated  spectrum  (one  trial) 

I 1 


Figure  2.7:  The  integrated  power  spectrum  of  AR  noise  (N  = 512,  a2  = 2,  a = 0.6). 
(a)  Waveform  of  AR  noise  in  time  domain;  (b)  Power  spectrum;  (c)  The  mean  of  power 
spectrum;  (d)  The  variance  of  power  spectrum. 
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The  waveform  of  AR  noise 

“I 1 r 


-61 1 1 1 1 1 

0 200  400  600  800  1000  1200 

TIME  (o^=2,  N=1024) 

(a) 


The  mean  of  the  integrated  spectrum 


(c) 


The  integrated  spectrum  (one  trial) 


(b) 


(d) 


Figure  2.8:  The  integrated  power  spectrum  of  AR  noise  ( N = 1024,  cr2  = 2, a = 0.6). 
(a)  Waveform  of  AR  noise  in  time  domain;  (b)  The  power  spectrum;  (c)  The  mean  of  the 
power  spectrum;  (d)  The  variance  of  the  power  spectrum. 
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From  Eq.  2.23,  we  can  see  that  the  numerator  and  denominator  are  two  dependent 
variables.  It  is  not  simple  to  find  the  distribution  of  the  ratio  spectrum  F(up),  as  discussed 
in  [4],  However,  the  bounds  or  confidence  levels  can  be  used  to  quantify  the  distribution. 
One  bound  can  be  found  in  page  484  of  [4],  We  have 

o™%  ~ ~ a (77  JQ  H2(x)dx )*  (2.24) 

with  asymptotic  confidence  level  7 = A (a),  where  H(x)  can  be  viewed  as  a transform 
function  given  in  Eq.  2.15,  and 

OO 

A (a)  = (— 1 )Je~2aV  (2.25) 

j=- OO 

From  Eq.  2.24,  the  bound  is  inversely  proportional  to  the  square  root  of  N.  As  a result,  the 
bound  is  zero  if  N is  infinitely  large. 

Another  relevant  bound  is  the  confidence  interval  for  CDF  (cumulative  density 
function),  which  can  be  found  in  page  251  of  [23].  Here,  we  only  introduce  the  con- 
clusion. For  a specific  x,  let  nx  be  the  number  of  27 ’s  that  do  not  exceed  x and  n is  the  total 
number.  Then,  the  CDF  estimate  is  given  by 

F(x)  = ^ (2.26) 

n 

and  the  interval  is  given  by 

d=^ \J F(x)(l  - F(x))  (2.27) 

where  zu  is  the  quantity  corresponding  to  confidence  coefficient  7 = 2u  — 1.  We  can  thus 
claim  with  confidence  coefficient  7 that  the  unknown  F(u)  is  within  the  interval.  Note  that 
the  length  d of  this  interval  depends  on  x and  n.  The  variance  is  small  if  n is  large.  We 
shall  now  find  an  interval  estimate  F{x)  + c of  F(x)  where  c is  a constant.  The  constant  c 
is  such  that 


P(|F(x)  - F(x) | < c)  = 7 


(2.28) 
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To  find  c,  we  form  the  maximum  value 


s = max  |F(x)  — F(x)  | 


(2.29) 


It  is  not  simple  to  find  Fs(s ) (cumulative  distribution  of  ,s).  We  give  next  an  ap- 
proximate solution  due  to  Kolmogoroff:  For  large  n, 


Fs(x ) fzs  1 — 2e 


-2  nx2 


(2.30) 


From  this,  it  follows  that  7 = Fs(c)  = 1 - 2e~Wc\  We  can  thus  claim  with  confidence 
coefficient  7 that  the  unknown  F(u)  is  between  the  curves  F(u)  + c and  F(u)  - c where 


c = t/-^~ln- — 1 
2 n 2 


(2.31) 


After  introducing  the  previous  work,  we  next  derive  the  approximate  distribution 
of  the  ratio  spectrum  F(u).  Eq.  2.23  can  be  rewritten  as 

1 


FM  = 


1 1 Zi=v+1P(u>i) 
E UiPM 

1 


(2.32) 


1 + £ 
y 


Let  z — |,  then  the  PDF  (probability  density  function)  of  z is 

/oo 

\y\fxV(zy,y)dy 

•OO 

/OO 

\v\M*v)fv(v)dy 

•OO 


(2.33) 


where  the  fx  and  fy  are  the  PDF  of  x and  y respectively  and  have  been  discussed.  Further, 
let  r — F(ujp)  for  simplicity.  Then,  the  PDF  of  the  ratio  spectrum  is  given  by 


Mr)  = 


(2.34) 


From  the  above  equation,  we  can  see  the  distribution  is  complicated  and  it  is  im- 
possible to  obtain  a closed-form  solution.  However,  we  know  that  the  integrated  spectrum 
estimate  is  unbiased  with  variance,  inversely  proportional  to  the  data  length  N.  We  next 
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derive  the  mean  and  variance  of  the  ratio  spectrum  (second-order  feature  of  distribution). 
We  can  rewrite  the  Eq.  2.23  as 

ELj  P(u,) 


FM  = 


yN 

jq 

X 


e:=i  pm 


x + y 


(2.35) 


where 


x 


= v t 


(2.36) 


and 


9=1 


N 


pM 


(2.37) 


q=p+l 


Obviously,  x and  y are  two  independent  random  variables.  From  the  previous  discussion, 
the  x and  y have  the  mean  and  variance  given  by 

1 k 

(2.38) 


9=1 


and 


var[x)  = 


(2.39) 


Similarly, 


9=1 


N 


m = jf  E pM 


(2.40) 


and 


q=k+ 1 


N 


var\ 


^ E p2K) 


(2.41) 


q=fc+l 


Let  x and  y denote  the  mean  of  x and  y respectively  for  simplicity,  and  two  dimen- 
sion Taylor  expansion  is  used.  If  F(ujp)  is  expanded  around  the  point  (x,  y),  then, 

\ _ x dF(up)  dF((jjv) , 

P(Up)  ~ X + V+  dx  l(i=x’l'=y)(;r  _ X)  I (x=x,y=y){y  ~ y) 


x + y 


22 


+M 

x 


+ 


X 


+ v (x  + y) 


7{x-x) 


X 


X 


+ 


y x — x 


(x  + y)2 
x y-y 


( y-y)  + M 


+ M 


, , (2.42) 

x + y x+yx+y  x+yx+y 

where  M stands  for  the  terms  with  the  orders  larger  than  1.  It  is  reasonable  to  ignore 
the  terms  with  order  larger  than  1 since  the  variance  x and  y are  very  small,  inversely 
proportional  to  N.  Actually,  even  if  (x,  y)  is  not  close  to  the  mean  (x,  y),  we  can  show  that 
M is  much  smaller  than  the  terms  with  first  order  in  the  expansion.  As  a result,  we  still 
have  good  accuracy  after  M is  removed.  To  show  it,  let  us  look  at  one  of  the  second  order 
terms  M2,  M2x  given  by 

82F(“r)> 


Mox  — 


dx 2 

y ,x  — x 


\(x=x,y=y)(£  x) 


— zy 

x+y  x+y 
V 2 


x + y 


(2.43) 


where  r\  = Similarly,  we  have 


M2y  - 


d2F(up) 


\(x=x,y=y){il  2/) 


X 


y~yf 


x+y  x+y 


x + y 


(2.44) 


where  r2  = We  can  easily  show 


E[ri]  = 0 


(2.45) 


and 


var[ri]  = 


< 


var(x) 

(x  + y)2 

(fon  H2(co)du)2 
N 


(2.46) 
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Similarly,  E[r2\  = 0 and  var[r2]  < 1/N. 

From  the  above  derivation,  very  likely,  rx,r2  « 1.  Asa  result,  the  second  order 
term  M2  is  much  smaller  than  the  first  order  term  Mx.  Similarly,  the  derivation  can  be 
repeated,  and  other  terms  can  be  examined.  As  such,  we  can  just  take  first-order  terms 
and  remove  high  order  terms  with  good  accuracy.  Also,  if  N increases,  then  variance  of  r 
decreases  and  approaches  zero.  So,  the  accuracy  is  proportional  to  N.  As  a result,  Eq.  2.42 
becomes 


(x+y)2 


(2.47) 


Consequently,  from  Eq.  2.47,  the  mean  of  the  ratio  is  given  by 


£[fH)]  = 


X 


+ 


x + y {x  + y)2 
= f(“p) 

and  the  variance  of  ratio  spectrum  is  given  by 


E[x  — x] 


x 


(x  + y ): 


;E[y  - y] 


var{F{up ))  = 


V 


(x  + y) 


-var{x) 


x 


(x  + y) 


:var( 


If  using  mx  and  my  denote  the  means  of  x and  y,  then 

var{F{up ))  = 


— var{x)  + 


{mx  + my)A^' ' {mx  + my)A 
(1  - F((jp))2var{x ))  + F2{uv)var{y) 
{mx  + rriy)2 


var{y ) 


(2.48) 


(2.49) 


(2.50) 


where  mx,  my,  var(x)  and  var{y ) in  Eq.  2.50  are  given  in  Eq.  2.38,  Eq.  2.40,  Eq.  2.39 
and  Eq.  2.41  respectively.  From  the  above  equation,  we  can  see  that  variance  of  the  ratio 
spectrum  is  inversely  proportional  to  N since  the  var{x ) and  var{y ) both  are  inversely 
proportional  to  the  N.  Besides, 


m: 


lim  var{F{(jjp))  = . . . 

JV-»°o  (mx  + my)AN 

= 0 


lim  var(x ) + 


mi 


{mx  + my)A  n 


lim  var{y) 


(2.51) 
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Based  on  above  analysis  for  the  mean  and  variance  of  ratio  spectrum,  we  conclude 
that  the  ratio  spectrum  estimate  is  consistent. 

Also,  the  variance  of  F(ujp))  can  be  written  in  another  form.  If  the  input  signal  is 
given  in  Eq.  2. 13,  it  is  easy  to  show 


,,  1(1-  F(uy))2  C \H(u)\*du,  + F2( Up)  £ | H(u,)\‘du, 

var[F[uD ))  = s. n 

N (/*  \H(u})\2dwY  { ) 

From  the  above  equation,  the  variance  is  independent  of  a 2,  and  is  inversely  proportional 
to  N . 


As  an  example,  if  the  input  is  white  Gaussian  noise  with  variance  cr2,  then 


mx  = a2k/N 


(2.53) 


and 


var(x ) = kcr^/N2  (2.54) 

Similarly,  we  get 

my  = <t2(N  - k)/N  (2.55) 

and 


var(y ) = (N  — k)^  /N2 


(2.56) 


Replacing  the  mx,my,var(x ) and  var(y ) in  Eq.  2.50,  we  obtain  the  variance  of  the 
ratio  spectrum 


var(F(LUp)) 


F(uP)(  1 ~ F(up)) 
N 


(2.57) 


This  result  is  similar  to  that  given  in  Eq.  2.27.  From  Eq.  2.47,  it  is  not  hard  to 
derive  the  distribution  of  ratio  spectrum  since  the  RS  is  the  linear  combination  of  two 
integrated  spectrum  x and  y having  approximate  Gaussian  distribution.  Consequently,  the 
RS  is  approximately  a Gaussian  distribution. 
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To  verify  the  above  expressions  derived  for  the  mean  and  variance  of  the  ratio 
spectrum,  we  show  simulation  results.  In  the  simulation,  we  obtain  the  mean  and  variance 
by  averaging  1000  trials.  In  the  first  simulation,  the  input  signal  is  Gaussian  white  noise. 
To  see  the  dependence  of  the  variance  of  RS  on  cr2,  we  use  a2  = 1 and  a2  = 2 respectively, 
with  length  N = 1024  unchanged.  The  mean  and  variance  are  shown  in  Fig.  2.9(a)(b)  with 
cr2  = 1 and  in  Fig.  2.9(c)(d)  with  cr2  = 2 respectively.  We  observe  that  the  simulation 
results  are  very  close  to  the  analytical  ones,  and  independent  of  a2.  Also,  we  analyze  the 
relationship  between  variance  and  length  N.  Fig.  2.10  shows  the  results  for  the  case  with 
N = 2048  other  than  N = 1024.  From  the  figure,  we  can  see  that  the  variance  is  cut 
in  half  compared  with  N = 1024,  which  is  expected.  In  the  second  experiment,  we  use 
AR(1)  noise,  and  we  change  N,  a,  and  a2.  Fig.  2.11  shows  the  results  with  a = 0.3, 
N — 1024,  while  Fig.  2.12  shows  the  results  with  a — 0.3,  N = 2048.  The  result  with 
a = 0.7  is  shown  in  Fig.  2.13.  All  results  in  the  simulations  agree  with  the  derivation  given 
in  Eq.  2.50.  Also,  the  variance  of  the  ratio  spectrum  is  independent  of  cr2,  and  is  inversely 
proportional  to  N. 

2.5  Power  Distribution  by  Uniformly  Sampling  the  Ratio  Axis 

In  many  signal  processing  applications  such  as  coding,  de-noising,  and  detection, 
it  is  critical  to  represent  a signal  with  the  fewest  number  of  significant  components.  Gen- 
erally, the  Fourier  transform  is  a reasonable  representation  for  periodic  signals,  while  the 
wavelet  transform  is  good  for  representing  transient  signals.  In  this  section,  we  show  that 
signal  power  components  are  enhanced  or  noise  components  are  reduced  by  using  the  RS. 

Consider  a signal  with  a few  dominant  components  in  the  power  spectrum  and  that 
the  Fourier  Transform  is  appropriate.  A simple  example  is  the  power  spectrum  of  a voiced 
phoneme  shown  in  Fig.  2.14(a)  where  there  are  approximately  three  dominant  frequency 
bands  with  variable  center  frequency  and  bandwidth,  and  the  remaining  frequency  compo- 
nents contain  little  information.  For  simplicity  and  without  loss  of  generality,  consider  one 
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Mean  of  the  RS  for  white  noise 


(a) 


Mean  of  the  RS  for  white  noise 


(b) 


(c) 


(d) 


Figure  2.9:  The  mean  and  variance  of  the  ratio  spectrum  for  white  noise  (N  = 1024).  (a) 
The  mean  of  the  RS  with  a2  = 1;  (b)  The  variance  of  the  RS  with  a2  = 1;  (c)  The  mean  of 
the  RS  with  a2  — 2;  (d)  The  variance  of  the  RS  with  a2  = 2. 
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(a) 


Mean  of  the  RS  for  white  noise 


(b) 


(c) 


(d) 


Figure  2.10:  The  mean  and  variance  of  the  ratio  spectrum  for  white  noise  (N  = 2048).  (a) 
The  mean  of  the  RS  with  a2  = 1;  (b)  The  variance  of  the  RS  with  a2  = 1;  (c)  The  mean  of 
the  RS  for  with  a2  = 2;  (d)  The  variance  of  the  RS  with  a2  = 2. 
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(a) 


(b) 


(c) 


(d) 


Figure  2.11:  The  mean  and  variance  of  ratio  spectrum  for  AR  noise  ( N = 1024,  a = 0.3). 
(a)  The  mean  of  the  RS  with  o\  = 1;  (b)  The  variance  of  the  RS  with  a\  = 1;  (c)  The  mean 
of  the  RS  with  aj  — 2;  (d)  The  variance  of  the  RS  with  = 2. 
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(a) 


(b) 


(c) 


(d) 


Figure  2.12:  The  mean  and  variance  of  the  ratio  spectrum  for  AR  noise  (N  = 2048, 
a = 0.3).  (a)  The  mean  of  the  RS  with  a2  = 1;  (b)  The  variance  of  the  RS  with  a2  = 1; 
(c)  The  mean  of  the  RS  with  a2e  = 2;  (d)  The  variance  of  the  RS  with  a\  = 2. 
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(a)  (b) 


(c) 


(d) 


Figure  2.13:  The  mean  and  variance  of  the  ratio  spectrum  for  AR  noise  (cf?  = 2,  a = 0.7). 
(a)  The  mean  of  the  RS  with  N = 1024;  (b)  The  variance  of  the  RS  with  N = 1024;  (c) 
The  mean  of  the  RS  with  N = 2048;  (d)  The  variance  of  the  RS  with  N = 2048. 
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of  the  high-power  bands  in  the  spectrum  and  assume  that  the  power  outside  this  band  is 
very  small.  Let  fm  denote  the  sampling  frequency,  Bt  bandwidth,  Psi  power  of  this  band, 
Pt  total  power,  and  NT  the  total  data  length.  Then  the  number  of  points  representing  this 
band  in  terms  of  the  power  spectrum  (using  the  FFT)  is  approximately: 


Nt 

Nfi  = -j^B, 


(2.58) 


The  band  is  assigned  samples  based  on  their  bandwidth  alone. 

However,  if  we  sample  the  signal  uniformly  with  NT  point  along  the  ratio  axis 
(vertical  axis)  of  the  RS  shown  in  Fig.  2.14(b),  we  then  sample  the  frequency  axis  non- 
uniformly  with  samples  falling  at  regions  of  high  energy.  The  number  of  points  falling  into 
this  band  is: 


Clearly,  the  number  of  samples  falling 
band. 


Nr 

Nri  = -5- Psi 

Br 


within  this  band  is 


(2.59) 

proportional  to  the  power  of  this 


Let  us  denote  Pa  the  signal  average  power  density,  PaBi  the  average  power  density 
of  this  band.  Then 


(2.60) 


and 


p . 

PaBi  = (2.61) 

We  define  an  enhancement  rate  factor  G as  the  ratio  of  the  number  of  sampled  data 
falling  into  this  band  obtained  by  sampling  the  RS  along  ratio  axis  to  the  original  number 
of  samples.  Then 


G = 


AC 


N 


fi 


fmPsi 

PtB 


aBi 


(2.62) 


We  can  see  the  enhancement  rate  is  proportional  to  fm  and  and  1 / Bt.  Most  samples 

fall  on  the  components  with  power  density  larger  than  the  average  power  density  PT/fm- 
On  the  contrary,  the  components  with  power  density  less  than  average  power  are  reduced. 
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For  a purely  sinusoidal  signal,  G = Nt-  An  estimate  of  a practical  speech  signal  might  be 
Ps  = 0.5 Pt,  B = 100 Hz,  fm  = 16 kHz.  In  this  case,  the  enhancement  rate  is  G — 80. 
The  high-energy  bands  are  oversampled  by  a factor  of  G. 

For  K bands,  the  available  compression  rate  is  given  by: 


a=NL=  = /m  EL  P« 

Nt  %B,  Pt  EZ,  B, 


(2.63) 


Equivalently,  if  the  total  number  of  samples  is  NT/G  rather  than  NT,  then  all  signal  in  the 
band  is  sampled,  with  less  samples  falling  on  the  region  with  smaller  power  density.  That 
is: 

Nr  = Nf  (2.64) 


To  see  the  effect  of  non-uniform  sampling,  one  result  of  the  simulation  is  shown 
in  Fig.  2.14.  Fig.  2.14(a)  shows  a typical  power  spectrum  of  a speech  signal,  while 
Fig.  2.14(b)  shows  its  ratio  spectrum.  There  are  three  dominant  bands  in  the  spectrum 
with  the  same  bandwidth,  and  the  power  of  the  first  band  with  central  frequency  1000Hz  is 
twice  that  of  second  band  with  central  frequency  2500Hz,  and  four  times  the  power  of  the 
third  band  with  central  frequency  4000Hz.  In  the  simulation,  fa  = 20000 Hz,  Bt  = 200 Hz, 
Nt  — 1536,  using  Eq.  2.63,  we  get  G ss  30  and  Nr  = N^/G  = 50.  The  spectrum  obtained 
by  uniformly  sampling  the  ratio  axis  with  NT  = 200  points  is  shown  in  Fig.  2.14(c).  The 
frequency  location  distribution  of  the  sampled  data  is  shown  in  Fig.  2.14(d).  We  can  see 
that  around  50%,  25%  and  12.5%  of  samples  are  in  band  1 , band  2 and  band  3 respectively, 
which  agree  with  what  we  expect. 

Fig.  2.15  is  similar  to  Fig.  2.14  except  Nt  = 50.  We  can  see  that  all  signal  infor- 
mation except  other  components  with  very  small  power  density  is  preserved.  From  the  two 
figures,  we  observe  that  the  number  of  samples  is  reduced  significantly.  Consequently,  we 
can  use  50  samples  to  represent  the  spectrum  rather  than  NT  samples. 

In  many  applications,  the  high  power  density  components  tend  to  represent  signal 
features.  Subsequently,  lower-energy  bands  tend  to  represent  noise  or  less  important  signal 
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Power  spectrum  by  uniform  sampling  in  ratio  axis 


frequency  of  sampled  data(Hz) 


(c) 


Ratio  spectrum 


frequency(Hz) 

(b) 

Uniform  sampling  in  ratio  axis 
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Figure  2.14:  Nonuniform  sampling  of  power  spectrum  NT  = Nm  = 200.  (a)  Power 
spectrum;  (b)  Ratio  spectrum;  (c)  the  power  spectrum  obtained  by  uniformly  sampling  the 
ratio  axis;  (d)  the  ratio  spectrum  obtained  by  uniformly  sampling  the  ratio  axis. 
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Figure  2.15:  Nonuniform  sampling  of  power  spectrum.  NT  = Nm  = 50  (a)  Power  spec- 
trum; (b)  Ratio  spectrum;  (c)  the  power  spectrum  obtained  by  uniformly  sampling  the  ratio 
axis;  (d)  the  ratio  spectrum  obtained  by  uniformly  sampling  the  ratio  axis. 


35 


information  and  we  speculate  that  we  can  use  fewer  frequency  samples  to  represent  this 
information.  Since  most  of  the  information  is  located  in  the  bands  with  high  power  density 
and  little  information  is  located  in  bands  with  small  power  density,  the  RS  nonuniform 
sampling  tends  to  place  more  samples  in  regions  of  high  information  and  thus  increase  the 
signal  to  noise  ratio  (SNR).  This  increase  in  SNR  directly  leads  to  an  increase  in  detection 
probability  as  is  discussed  in  chapter  3. 


CHAPTER  3 

APPLICATIONS  IN  DETECTION  AND  PARAMETER  ESTIMATION 

3.1  Introduction 

In  chapter  2,  we  showed  that  the  conventional  power  spectrum  is  an  inconsistent 
estimate,  while  the  integrated  spectrum  and  the  RS  produces  a consistent  estimate.  In 
this  chapter,  we  will  show  that  this  property  can  be  used  in  parameter  estimation.  Unlike 
most  parameter  estimators  based  in  the  time  domain,  the  RS  based  parameter  estimator 
operating  in  frequency  domain  may  produce  better  performance  in  many  situations  where 
the  processing  in  frequency  domain  is  necessary. 

In  addition,  we  have  shown  that  most  samples  fall  into  regions  with  high  power  den- 
sity if  we  uniformly  sample  the  spectrum  along  the  ratio  axis.  As  a result,  the  components 
with  high  power  density  in  the  power  spectrum  are  enhanced  while  those  components  with 
low  power  density  are  reduced,  which  can  be  used  in  detection  problem.  The  proposed  al- 
gorithm will  be  compared  with  other  widely-used  algorithms  such  as  GLRT  and  wavelets. 
In  this  chapter,  we  will  discuss  these  applications. 

3.2  Application  in  Detection  of  Unknown  Signal 

The  detection  of  signals  or  targets  find  applications  in  many  areas.  When  a signal 
waveform  is  known,  then  the  optimal  detector  is  the  likelihood  ratio  test  (LRT)  which  is 
usually  implemented  with  a matched  filter.  To  achieve  the  optimal  detection  performance, 
the  phase  information  and  Gaussian  white  noise  have  to  be  assumed.  If  one  of  these  as- 
sumptions is  not  met,  the  detection  performance  will  be  degraded.  When  a signal  to  be 
detected  is  unknown,  the  most  widely  used  method  is  the  generalized  likelihood  ratio  test 
(GLRT)  where  the  maximum  likelihood  (ML)  estimation  of  the  unknown  signal  in  LRT 
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may  be  used  as  proposed  in  [24],  [2],  The  literature  discusses  numerous  detection  algo- 
rithms that  require  various  kinds  of  information  about  the  signal,  such  as  the  arrival  time, 
bandwidth,  type  of  signal,  or  the  transient  nature  [25]-  [27].  In  other  words,  we  can  find 
the  optimum  detection  method  on  the  signal  given  various  constraints  on  the  signal  itself. 

In  recent  years,  a lot  of  attention  has  been  devoted  to  the  detection  of  arbitrary  sig- 
nals based  on  a transform  representation.  For  example,  the  linear  transformations  include 
the  Gabor  filter  and  wavelet  representations,  while  the  nonlinear  transformations  include 
the  Wigner-Ville  distribution  and  the  ambiguity  function.  One  common  feature  of  these 
detection  techniques  is  the  resulting  data  reduction.  Many  of  these  detection  algorithm  are 
well-suited  to  the  detection  of  a particular  type  of  signal.  A performance  study  for  many 
different  detection  algorithms  can  be  found  in  [26],  [28].  The  most  robust  algorithms  tend 
to  be  those  with  the  fewest  assumptions,  or  else  those  with  assumptions  that  are  typically 
well-satisfied. 

In  this  section,  we  will  first  introduce  the  GLRT  algorithm,  then  discuss  the  detec- 
tion algorithm  based  on  wavelet  thresholding,  and  finally  propose  new  algorithms  based 

on  the  RS  for  the  detection  of  unknown  signal. 

3.2,1  GLRT  Detection  Algorithm 

Consider  the  following  detection  problem:  H0  : r — n and  Hi  : r = s + n, 
where  the  n,  s and  r are  the  noise,  signal  and  received  signal  vector  (with  dimension  N) 
respectively.  The  GLRT  will  be: 


A(r)  = 


max*  fr/Hi  (r/ H\) 


(3.1) 


max0o  fr/H0(r/H0 ) 

where  and  9(i  are  the  parameters  to  be  estimated.  The  test  is  compared  with  a threshold 
A,  decide  Hi  if  A > A;  decide  Hq  if  A < A.  If  the  noise  is  white  Gaussian  and  the  signal  is 
unknown,  then  ML  may  be  used  to  estimate  it. 

Under  Hi,  the  pdf  of  vector  r may  be  written  as 

1 


fr/Hxir)  = 


nN/2aN 


(r  — s)T{r  — s) 
exp[ -] 


(3.2) 
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Under  H0,  the  pdf  of  vector  r may  be  written  as 

1 rTr 

fr/Ho(r)  = ~^N/2^N  eXP[ (3-3) 

Then,  maximizing  fr/H x (r)  with  respect  to  s produces 


SML  = r (3.4) 

That  is,  the  signal  to  be  estimated  is  the  noisy  signal  if  we  do  not  have  more  knowledge  of 
the  signal.  Consequently,  the  GLRT  is  reduced  to  the  test  variable 


A = 


(3.5) 


Under  H0, 


A = nTn 


(3.6) 


where  A is  chi-square  distribution  random  variable  with  N degrees  of  freedom.  Under  Hi 

A = (s  + n)T(s  + n)  (3.7) 


where  A is  noncentral  chi-square  distribution  with  N degrees  of  freedom  and  noncentrality 
parameter: 

v = sTs  (3.8) 

The  false  alarm  rate  Pp  can  be  expressed  as: 

/>oo  N 

Pf=  h(rlHa)ir  = e-xY,-rr  (3.9) 

U k'- 

where  A is  the  threshold  which  is  associated  with  PF,  and  f\(r/H0 ) is  the  PDF  of  the  test 
under  H0.  The  detection  probability  can  be  written  as: 

roo 

Pd=z  J f\(r/Hi)dr  = Qm(v?,  A1/2)  (3.10) 

where  f\(r/Hi ) is  the  PDF  of  the  test  variable  under  Hi,  and  QM  is  the  generalized 
Marcum  Q-fimction  [2], 
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From  the  expressions  of  PF  and  PD,  we  can  see  that  it  is  hard  to  obtain  closed- 
form  solution  for  them.  However,  the  detection  probability  Pd  is  a monotonically  increas- 
ing function  of  the  noncentrality  v (for  a fixed  number  of  degree  of  freedom  TV).  Or  PD 
increases  if  N is  reduced  for  fixed  v,  which  is  equivalent  to  data  reduction.  In  practice, 
the  ROC  (Receiver  Operating  Characteristic)  relationship  is  used  to  evaluate  the  detection 
performance.  One  of  the  ROC  curves  is  shown  in  Fig.  3.4(b),  where  the  horizontal  axis 
and  the  vertical  axis  stand  for  the  false  alarm  rate  PF  and  detection  rate  PF  respectively 
given  a fixed  SNR.  Another  widely  used  measurement  for  detection  performance  is  the 
relationship  between  PD  vs.  SNR  given  a fixed  false  alarm  rate  PD.  An  example  of  PD 
vs.  SNR  is  shown  in  Fig.  3.4(c), 

The  GLRT  is  the  simple  energy  test  if  we  have  no  other  knowledge  except  the 
Gaussian  white  noise  with  constant  variance.  However,  if  we  have  other  knowledge  of  the 
signal,  the  GLRT  may  be  modified  to  improve  the  detection  performance  as  is  discussed  in 
[28], 

3.2.2  Detection  Algorithm  Using  Wavelet 

In  recent  years,  wavelet  analysis  has  become  a powerful  analyzing  and  synthesizing 
tool  for  many  applications.  The  ability  of  wavelets  to  resolve  different  scales  and  trans- 
form information  among  these  scales  has  been  successfully  applied  to  signal  processing, 
data  compression,  and  de-noising.  For  many  problems  in  signal  and  image  processing  the 
observed  data  are  influenced  by  noise  and  the  question  arises  of  how  to  best  estimate  the 
underlying  signal  from  noisy  observations.  An  important  step  in  solving  this  problem  was 
achieved  by  the  work  of  Donoho  and  Johnstone  using  the  wavelet  transform  of  the  noisy 
data.  They  studied  the  situation  of  detecting  a signal  in  additive  Gaussian  white  noise.  In 
particular,  the  basic  idea  involved  special  thresholding  schemes  for  noisy  wavelet  coeffi- 
cients using  properties  of  orthogonal  wavelet  transforms  and  Gaussian  random  variables. 
Using  the  inverse  wavelet  transform,  an  estimate  for  the  original  signal  was  obtained  [29]- 
[32]. 
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In  the  wavelet  literature,  there  are  many  successful  applications.  Coding  and  de- 
noting are  two  of  the  most  famous  wavelet  applications.  However,  the  efficiency  of 
wavelet  denoising  critically  depends  on  the  type  of  original  signal  to  be  denoised  and  the 
particular  wavelet  and  algorithm  used.  Donoho  proved  that  wavelet  shrinkage  or  the  thresh- 
olding approach  is  near  optimal  in  terms  of  MSE  (and  optimal  in  the  mini-max  sense)  if 
the  signal  is  piecewise  smooth  and  noise  is  white  Gaussian. 

We  first  introduce  two  conclusions  in  using  wavelets  thresholding,  before  we  pro- 
pose a new  detection  algorithm  using  wavelets  similar  to  the  denoising  algorithm. 
Thresholding  Wavelet  Coefficients  of  Noisy  Signal 


Let  y(n),  f(n),  and  w(n)  be  the  noisy  signal,  the  original  clean  signal,  and  white 
Gaussian  noise,  respectfully.  Then  y(n)  = f(n)  + w(n),  1 < n < N.  Suppose  that  / 
is  a piecewise  polynomial  over  [0,  N — 1],  It  is  equal  to  polynomial  pi  of  degree  d on  K 
successive  intervals  [r*,  r;+i]  and  can  be  written. 

K- 1 

/(n)  = ^PiWtn.n+M  (3.H) 

1=0 

where  I[Tl,Tl+1]  is  an  appropriate  rectangular  windowing  function.  We  cite  the  following 
theorems  originally  proved  by  Donoho  [29]. 

Theorem  1 [29]:  For  a threshold  T = <j\J2logeN , the  average  approximation  error 
for  N > 4 satisfies 

e = E{\\f-Y\\ 2} 

N— 1 

< (21ogeN  + 1)  [o-2  + min(l  < f,  gm  > |2,  o-2)] 

m=0 

(3.12) 


where  gm  is  the  vector  of  an  orthogonal  basis. 

Theorem  2 [29]:  Suppose  that  / is  a piecewise  polynomial  over  [0,  iV  — 1],  a 
hard-thresholding  estimator  Y calculated  with  a Daubechies  wavelet  with  d + 1 vanishing 
moments  satisfies: 


my  - / 12] 


k(d  + 1)  Gloge 2 N 
P 2 


2 


N 


(3.13) 
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with  C = |^~2  > and  p2  = (signal  to  noise  ratio),  and  k the  number  of  break  points. 

From  Eq.  3.12,  the  wavelet  shrinkage  method  produces  the  small  error  when  most 
coefficients  | < /,  gm  > |2  are  less  than  a2.  In  other  words,  after  the  wavelet  transform, 
there  are  a few  large  coefficients  while  most  of  the  coefficients  are  close  to  zero.  In  this 
situation,  most  noise  is  filtered,  thus  the  wavelet  thresholding  is  efficient. 

Also,  from  the  best  basis  point  of  view,  the  best  basis  is  selected  to  obtain  the 
minimum  cost  function.  The  entropy  cost  function  [33],  [34]  is: 

N I < f-  O™  > I2  I < f,9m  > I2 

J y 1 (3.14) 


C</-BJ  = -£^T7F^‘ogJ 

where  Ba  is  a basis,  and 

0 < C(/,  Ba)  < logeN 


(3.15) 


It  reaches  an  upper  bound  for  a diffusing  basis,  and  the  lowest  bound  for  only  one  non- 
zero decomposition.  The  cost  function  is  small  only  when  there  exist  a few  dominant 
decomposition  coefficients. 

Furthermore,  Eq.  3.13  shows  the  effect  of  k (number  of  break  points).  If  k is 
large,  then  the  error  is  large  or  the  SNR  is  small.  For  example,  for  transient  signals  such  as 
speech,  the  signal  is  not  piecewise  smooth  but  oscillating-there  exist  many  transient  points. 
Therefore,  there  are  many  wavelet  coefficients  with  large  values,  that  is  , (|  < /,  gm  > |2  > 
cr2).  As  a result,  min(l  < f ,9m  > \2,&2)  is  very  large.  So  the  upper  error  bound 

indicated  in  Eq.  3.12  and  Eq.  3.13  is  large,  and  the  wavelet  thresholding  is  not  efficient  in 
the  particular  case  of  speech  signals. 

In  summary,  for  a noisy  signal  where  the  signal  is  piecewise,  the  near  optimal  MSE 
can  be  obtained  if  the  wavelet  thresholding  algorithm  is  used.  SNR  can  be  increased  by  a 
order  of  \m0^  N-  The  conclusions  are  the  fundamental  for  our  proposed  detection  algorithm 
to  be  discussed. 


Figure  3.1 : The  block  diagram  of  detection  using  WT. 


43 


Detection  Algorithm  Using  Wavelets 

The  block  diagram  of  the  detection  algorithm  using  the  wavelet  transform  is  shown 
in  Fig.  3.1.  The  input  signal  x(k)  is  transformed  to  the  wavelet  domain  with  coefficients 
y{k).  If  y(k)  is  larger  than  a threshold  T given  in  Eq.  3.12,  then  we  keep  it  unchanged, 
i.e.,  Z (k)  = y(k),  while  seting  Z(k)  to  zero  if  it  is  less  than  T.  Therefore,  we  obtain  the  Z 
vector.  Using  an  energy  detector,  we  form  the  test  variable  A = ZTZ.  Then  we  compare 
the  test  variable  with  a threshold  Tj  set  depending  on  false  alarm  probability.  If  the  test  is 
larger  than  the  threshold  Tf,  decide  Hu  decide  H0  otherwise. 

Based  on  the  conclusions  in  the  last  section,  wavelets  are  most  appropriate  when 
there  are  only  few  dominant  coefficients  while  the  others  are  small.  After  thresholding, 
most  noise  will  be  removed  while  the  signal  is  almost  unchanged.  Equivalently,  data  size  is 
reduced,  which  results  in  an  increase  in  SNR.  Approximately,  the  SNR  of  the  test  variable 
increases  by  an  order  of  However,  in  the  algorithm,  we  assume  that  we  know  the 

variance  of  the  noise  and  that  the  signal  is  piecewise  smooth  specified  by  the  two  theorems 
discussed  above. 

Simulations  show  that  the  algorithm  is  very  efficient  for  piecewise  polynomial  sig- 
nals but  bad  for  transient  signals  in  the  simulation  as  expected.  Fig.  3.2(b)  shows  the  result 
(ROC),  when  the  input  shown  in  Fig.  3.2(a)  is  of  Gaussian  shape  with  SNR  = -10 dB.  It 
is  well  known  that  many  targets  or  objects  can  be  approximately  modelled  as  a Gaussian 
waveform.  The  ROC  shows  that  performance  using  the  proposed  wavelet  thresholding 
technique  is  much  better  than  that  of  conventional  GLRT.  However,  in  another  example, 
the  input  signal  is  the  three  damped  sinusoids  shown  in  Fig.  3.3(a),  which  is  widely  used 
in  test  of  transient  signal  detection  algorithm.  The  ROC  for  the  input  signal  is  shown  in 
Fig.  3.2(b).  We  observe  that  the  performance  using  the  proposed  wavelet  thresholding  is 
much  worse  than  that  of  conventional  GLRT.  The  results  of  the  two  simulations  for  both 
Gaussian  sharp  and  damped  sinusoids  signal  agree  with  our  expectation. 
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Therefore,  we  conclude  that  the  proposed  wavelet  thresholding  is  efficient  for  piece- 

wise  smooth  signals  and  not  for  periodic  signals. 

3.2.3  Detection  Algorithm  Using  Ratio  Spectrum 

In  the  last  section,  we  proposed  a detection  algorithm  based  wavelet  thresholding, 
and  we  showed  it  is  efficient  for  piecewise  signals  and  not  efficient  for  transient  signals. 
In  this  section,  we  propose  a RS-based  detection  algorithm  that  is  efficient  for  bandlimited 
signals. 

GLRT  in  Fourier  Domain 

Let  Sf,  rif  and  r y denote  the  Fourier  transform  of  three  time  domain  signals  s,  n 
and  r respectively.  A denotes  the  N x N Fourier  matrix,  an  orthogonal  matrix, 
under  H0 

rs  = rif  = An  (3.16) 

the  test  variable  is 

A = njnf  (3.17) 

It  can  be  shown  that  rf(k ) (1  < k < N)  is  a complex  Gaussian  distribution  with  mean  0 
and  variance  (cr2/2,  cr2/2).  As  a result,  |ry(/c)|2  is  the  x2(2)  distribution,  with  2 degrees  of 
freedom,  and  the  A is  a y?(2N)  variable.  The  pdf  of  A is: 


fH0(y ) = 

l 

cr2N2N(N  ■ 

- l)!y"  >exp(  ZJ 

(3.18) 

A has  the  mean  and  variance  of 

E(  A) 

= Na 2 

(3.19) 

and 

°A  = 

2Na4 

(3.20) 

under  Hi 


rf  = A(s  + n)  = Sf  + An 


(3.21) 
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Gaussian  waveform 


ROC 


Figure  3.2:  Gaussian  example:  input  consists  of  a Gaussian  shape  signal  with  proposed 
wavelet  method  (x)  and  GLRT  (solid  line),  (a)  Gaussian  Waveform;  (b)  ROC. 
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Damped  sinusoid  waveform 


Figure  3.3:  The  example  of  three  damped  sinusoids:  proposed  wavelet  method  (x)  and 
GLRT  (solid  line),  (a)  damped  sinusoid  Waveform;  (b)  ROC. 
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and  the  test  variable  is 


A = rfrf  = (sf  + An)H(sf  + An) 


(3.22) 


where  rf(k)  (1  < k < N)  is  a complex  Gaussian  distribution  with  mean  sf(k)  and  variance 
(<72/2,  a2/ 2).  It  can  be  shown  that  \rf(k)\2  is  the  noncentral  x2  distribution  with  2 degrees 
of  freedom  and  noncentrality  parameter  v = \sf(k)\2.  As  a result,  the  A is  the  noncentral 
X2  distribution  with  2N  degrees  of  freedom  and  noncentrality  parameter  v = ||s/||2.  The 
pdf  of  A is: 


where  Ia(x)  is  the  ath  order  Bessel  function  of  the  first  kind, 
variance  : 


(yv)1/2 


a * 


(3.23) 


Also,  A has  the  mean  and 


E(  A)  = Na2  + v 


(3.24) 


and 


a2A  = 2,/Vcr4  + 4 a2v 


(3.25) 


It  is  hard  to  find  the  closed-form  solution  for  detection  rate  based  on  PDFs  given  by 
Eq.  3.18  and  Eq.  3.23  of  the  test  variable  A since  they  are  complicated.  Instead,  the  mean 
and  variance  of  A are  used  to  estimate  the  performance.  As  discussed  before,  the  detection 
probability  PD  is  a monotonically  increasing  function  of  the  noncentrality  v (for  a fixed 
number  of  degrees  of  freedom  N).  Or  PD  increases  if  N is  reduced  for  fixed  v,  which  is 
equivalent  to  data  reduction.  Consequently,  we  need  to  increase  the  SNR  to  improve  the 
detection  performance. 

Detection  Based  on  the  RS 


In  chapter  2,  we  show  that  we  can  enhance  the  signal  power  or  reduce  the  noise, 
which  equivalently  increases  the  SNR  if  we  uniformly  sample  the  ratio  axis  of  the  RS  . 
Motivated  by  this  property  of  the  RS,  we  propose  a new  detection  algorithm: 


1 . obtain  RS  of  the  received  signal 
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2.  uniformly  sample  its  ratio  axis  of  the  RS  with  Ns  points 

3.  add  all  sampled  power  components  obtained  by  the  sampling  to  form  the  new  detec- 
tion test. 

The  new  test  variable  becomes 


Mn)  = ihi2 


(3.26) 


where  rj(i)(  1 < i < Ns)  is  the  nonuniform  sampling  of  power  spectrum  of  the  received 
signal  r2(_7')(l  < j < N). 

In  the  last  section,  we  showed  that  the  strong  components  (with  higher  power  den- 
sity) can  be  enhanced  in  terms  of  more  samples  placed  by  uniformly  sampling  the  ratio 
axis.  This  operation  equivalently  increases  the  noncentrality  of  the  distribution,  or  in- 
creases the  SNR.  Therefore,  better  detection  performance  can  be  achieved.  The  quantity  of 
improvement  depends  on  the  structure  of  the  signal.  To  simplify  the  approximately  quan- 
titative analysis,  we  assume  there  is  only  one  signal  band  with  bandwidth  B and  power 
density  Ps/B  in  the  frequency  domain.  The  SNR  can  be  increased  in  the  average  or  mean 
sense  by: 


G = 


> 1 


(3.27) 


where  a2  is  the  variance  of  the  noise. 

From  Eq.  3.27,  we  can  see  G = 1 when  the  signal  is  a constant  over  the  whole 
frequency  range.  In  this  case,  the  SNR  is  unchanged  due  to  no  frequency  redundancy. 
Flowever,  the  maximum  SNR  gain  occurs  when  the  signal  is  a sinusoid  which  corresponds 
to  the  largest  frequency  redundancy,  in  this  case,  G = (1  + N fa)/ { 1 + -fa). 

In  this  way,  the  SNR  of  the  new  test  Aj  is  increased  approximately  by  a factor  of 
G.  A larger  G factor  leads  to  better  detection  performance.  Therefore,  the  algorithm  is 
more  efficient  for  a class  of  signals  with  large  frequency  redundancy,  such  as  speech  and 


49 


other  transient  acoustic  signals.  Also,  the  assumptions  made  are  more  easily  met  in  the 
real  world  than  the  algorithms  with  strong  constraints  such  as  known  locations  of  signal 
and  known  bandwidth  discussed  in  [26],  [28]. 

We  conclude  that  the  algorithm  is  efficient  for  signals  with  large  enhancement  fac- 
tors ( G ).  This  constraint  is  easily  met  by  many  kinds  of  signal  such  as  transient  acoustic 
signals  and  speech  signals.  Also,  we  can  expect  that  the  algorithm  is  more  robust  than 
many  others  since  the  assumptions  made  are  easily  met  in  the  real  world.  Unlike  most 

other  algorithms,  no  parameters  have  to  be  estimated  before  detection. 

Numerical  Simulations 

To  evaluate  the  RS-based  detection  algorithm,  we  demonstrate  its  performance  on 
three  examples.  We  first  use  a damped  sinusoid  and  a chirp  signal,  both  signals  are  widely 
used  in  the  detection  area  to  compare  detection  algorithms,  e.g.  see  [26],  [28],  Then,  we 
demonstrate  the  detection  performance  on  an  actual  speech  signal.  The  receiver  operating 
characteristic  (ROC)  and  detection  probability  with  signal  noise  ratio  (Pd  vs  SNR)  are 
computed  for  each  example  based  on  106  runs.  Each  signal  has  total  energy  1,  whereas 
the  additive  noise  is  white  and  Gaussian  with  zero  mean.  We  vary  the  noise  variance  to 
obtain  different  SNR.  The  ROC  is  obtained  based  on  SNR=-10dB,  where  Pd  vs  SNR  is 
obtained  for  false  alarm  rate  10“3,  the  same  parameters  as  in  [26],  [28],  Also,  the  number 
of  samples  is  Ns  = N/4.  The  solid  line  and  ‘x’  line  correspond  to  the  conventional  GLRT 
and  proposed  methods  respectively  in  each  example. 

Fig.  3.4  shows  the  situation  where  the  signal  consists  of  three  exponentially  decay- 
ing sinusoids  with  differing  onsets  and  different  modulated  frequency.  Fig.  3.5  shows  the 
case  with  a chirp  signal.  Finally,  Fig.  3.6  shows  the  results  using  a speech  phoneme  signal 
from  the  TIMIT  database.  We  observe  that  results  obtained  by  our  proposed  method  in  all 
examples  are  much  better  than  those  by  conventional  GLRT. 
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Damped  sinusoid  waveform 


ROC 


Pd  vs  SNR 


Figure  3.4:  The  example  of  three  damped  sinusoids:  proposed  method  (x)  and  GLRT 
(solid  line),  (a)  damped  sinusoid  Waveform;  (b)  ROC  (SNR  = -lOdB);  (c)  Pd  vs.  SNR 

( Pf  = io-3). 
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Chirp  waveform 


ROC 


Pd  vs  SNR 


Figure  3.5:  The  example  of  Chirp: proposed  method  (x)  and  GLRT(solid  line)  (a)  Chirp 
Waveform;  (b)  ROC  (SNR  = -lOdB);  (c)  Pd  vs.  SNR  (Pf  = 1(T3). 
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Speech  waveform  (er) 


Pd  vs  SNR 


Figure  3.6:  The  example  of  real  speech  signal:  proposed  method  (x)  and  GLRT(solid  line); 
(a)  A speech  Waveform;  (b)  ROC  (SNR  = -lOdB);  (c)  Pd  vs.  SNR  (Pf  = 10~3). 
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3.2.4  Summary  for  Detection  Algorithms 

We  introduced  a new  nonparametric  detection  algorithm  based  on  the  ratio  spec- 
trum. We  showed  this  algorithm  achieves  improved  detection  performance  by  nonuni- 
formly  sampling  the  frequency  axis.  In  addition,  we  expect  that  the  algorithm  to  be  robust 
since  the  assumptions  made  for  the  algorithm  are  easily  satisfied  by  many  kinds  of  sig- 
nals, particularly  for  transient  and  speech  signal.  Simulation  results  based  on  widely  used 
test  signals  and  real  speech  phonemes  show  better  performance  than  those  of  conventional 
GLRT. 

In  addition,  we  proposed  another  detection  algorithm  based  on  wavelet  threshold- 
ing. Analysis  and  simulation  show  that  the  proposed  wavelet  thresholding  is  suitable  for 
piecewise  smooth  signal  but  not  for  periodic  signals. 

3.3  Estimation  of  Parameters 

In  this  section,  we  first  introduce  general  methods  for  the  estimation  of  parameters. 
Using  the  advantage  of  the  consistent  estimate  of  the  ratio  spectrum  and  the  integrated 
spectrum,  we  then  propose  a new  method  for  parameter  estimation.  Also,  we  show  the 
performance  in  parameter  estimation  for  the  AR  model.  Finally,  we  compare  the  proposed 
method  with  widely  used  methods. 

3.3.1  General  Methods  for  Parameter  Estimation 

Consider  the  following  estimation  problem:  we  are  given  data  x(i)  fori  = 1,2, 
which  are  the  realized  value  of  N random  variables  Xlt  X2, ...,  XN.  The  joint  proba- 
bility distribution  of  Xlt  X2, ...,  XN  has  a known  mathematical  form,  with  joint  proba- 
bility density  function,  f(x i,  X2,  • • ■ , xjv;  $i,  &2,  •,  9P),  but  the  values  of  the  p parameters 
9ii82,  - ■ ■ ,9P  are  unknown.  We  wish  to  obtain  some  information  about  the  values  of 
9i,62,  ,9P  from  the  observed  data  X1:X2, XN. 

The  general  methods  of  estimation  include  maximum  likelihood  and  least  squares. 
The  widely  used  performance  evaluation  for  estimation  is  the  mean  and  variance  of  the 
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estimated  parameters.  If  the  means  of  the  estimate  is  equal  to  the  true  value  of  a parameter, 
this  is  referred  to  as  an  unbiased  estimate.  Also,  we  hope  the  variance  of  parameter  estimate 
is  as  small  as  possible. 

The  most  general  and  powerful  method  of  estimation  is  that  known  as  maximum 
likelihood.  This  method  produces  almost  all  the  well  known  estimates,  and  more  impor- 
tantly, may  be  applied  (in  principle)  to  any  type  of  estimation  problem  provided  only  that 
we  can  write  down  the  joint  probability  distribution  of  the  observations.  If  we  denote 
the  complete  set  of  observations  and  parameters  by  the  vector  X = (Xu  X2, XN)  and 
© = (0i,  02,  ■ ■ ■ , 0P)  respectively,  then  the  joint  probability  density  function  of  X may  be 
written  as  f(X/Q).  Consider  f(X/Q)  as  a function  of  © (with  x fixed  at  the  observed 
values).  If  we  can  find  0 to  make 


max/(X/0)  (3.28) 

Then,  we  call  the  estimation  of  parameter  © as  the  maximum  likelihood  estimate.  In 
practice,  we  obtain  the  0 by  solving  the  following 

df(X/Q) 

= 0 (3.29) 

Another  widely  used  parameter  estimation  method  is  least  squares.  Unlike  ML,  the 
method  of  least  squares  applies  only  to  the  case: 


V = O1X1  + 92x2  + • • • + 6qxq  + e (3.30) 

where  x\,  x2, ...,  xq  are  known,  and  0j,  d2l  ■ • • , 6P  are  unknown,  and  E(t)  — 0.  The  random 
variable,  e,  is  usually  regarded  as  the  measurement  error,  and  then  this  leads  to  a linear 
model. 

Let  yi  denote  the  value  y observed  in  the  zth  experiment  and  m,  x2i,  ■ ■ • , xqi  the 
values  of  xi,x2,  - ■ ■ ,xq  chosen  for  the  ith  experiment.  Then,  we  may  write, 


J/j  — 0\Xi i + 02X2i  H h dqXqi  + Cj, 


i = 1, 2,  • • • ,n 


(3.31) 
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where  E(t)  = 0,  all  i,  and  e,  represents  the  error  of  measurement  in  the  zth  experiment. 
The  problem  now  is  to  estimate  the  unknown  values  of  6\,  92,  ■ ■ ■ , 9P,  on  the  basis  of  the 
observed  values  yu  and  the  known  values  xu,  x2i,  ■ ■ ■ , xqi,  i = 1, 2,  • • • , n.  The  principle 
of  least  square  is  to  choose  estimates  of  81, 82,  • • • , 9P,  which  minimize  the  quantity, 

n 

Q{9\ , 92j  ■■■,9q)  = ~ 8\xu  - 92x2i 9qxqif  (3.32) 

i=l 

Then  differentiating  Q partially  with  respect  to  each  dj  gives, 


(3.33) 


We  obtain  the  estimate  of  parameters  by  solving  the  above  set  of  equations. 

3.3,2  Parameter  Estimation  in  the  Frequency  and  the  RS  Domain 

We  have  introduced  two  general  methods  for  parameter  estimation,  which  process 
the  data  in  the  time  domain.  They  both  are  widely  used  methods  since  the  data  is  usually 
given  in  the  time  domain.  However,  there  are  disadvantages  for  each  of  them.  For  ML, 
there  are  two  problems.  First,  it  is  not  easy  to  find  the  PDF  f{X/Q).  Second,  Eq.  3.29 
may  be  nonlinear,  thus  is  not  guaranteed  to  be  solved.  For  least  squares,  the  performance 
of  the  estimation  may  not  be  good  due  to  its  simple  model.  In  addition,  in  many  applica- 
tions, the  processing  in  the  transform  domain,  such  as  frequency  domain,  is  more  efficient 
than  that  in  the  time  domain.  As  an  example,  two  sinusoids  signals  can  be  separated  in  the 
frequency  domain  while  they  can  not  be  separated  in  the  time  domain.  In  the  last  chapter, 
we  have  discussed  the  integrated  spectrum  and  the  ratio  spectrum  and  have  shown  that  the 
estimates  of  the  ratio  spectrum  and  the  integrated  are  consistent.  Motivated  by  the  consis- 
tent feature  of  the  ratio  spectrum  and  the  integrated  spectrum,  we  discuss  the  application 
in  the  estimation  of  parameters  based  on  the  RS. 

Given  a set  of  data  x\,  x2,  ■ ■ ■ , x yv,  which  is  related  to  the  parameter  9 to  be  esti- 
mated, let  P t(u),  Ix(u ) and  Rx (u)  denote  the  power  spectrum,  the  integrated  spectrum 
and  the  ratio  spectrum  respectively.  Obviously,  X,  Px,  Ix  and  Rx  are  the  function  of  the 
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unknown  parameters  9.  We  can  estimate  the  9 on  the  basis  of  the  Ix  and  Rx.  First,  let  us 
consider  the  estimate  of  the  parameters  of  AR  noise  given  by 

x(n  + 1)  = ax(n)  + w(n),  n=l,2,---,N  (3.34) 

where  w{n)  is  Gaussian  white  noise  with  mean  0 and  variance  a2,  and  a is  an  unknown 
parameter.  Its  power  spectrum  is  given  by 

k = l,2,--,N  (3.35) 

and  the  integrated  spectrum  and  the  ratio  spectrum  are  given  by 

Uk)  = EiU  — a cos  k = l,2,--.,N  (3.36) 

and 

R*(k)  = -t$-y  k = 1,2,  ■ • ■ , N (3.37) 

The  above  three  equations  show  the  relationship  between  the  parameters  a2,  a and 
the  spectrum  Px,  the  integrated  spectrum  Ix  and  ratio  spectrum  Rx.  We  can  obtain  the  two 
parameters  by  solving  the  equation  if  we  have  two  observed  data  points,  such  as  Px(ki) 
and  Px{k2).  However,  in  practice,  we  can  only  get  the  estimates  for  the  spectrum  Px(k), 
the  integrated  spectrum  tx(k)  and  the  ratio  spectrum  Rx{k)  by  using  a set  of  observation 
x(k).  As  a result,  the  estimates  for  the  two  parameters  are  random  variables  rather  than 
deterministic  ones.  As  an  example,  Fig.  3.7  shows  the  spectrum  and  the  ratio  spectrum 
of  an  AR  signal.  In  the  figure,  the  spectrum  looks  noisy  or  very  random,  while  the  ratio 
spectrum  looks  smooth.  From  the  discussion  in  the  last  chapter,  the  integrated  spectrum 
and  the  ratio  spectrum  are  consistent  estimators  and  their  variances  are  zero  if  N is  infinite. 

Thus,  we  can  use  two  samples,  R(k1)  and  R(k2)  to  estimate  the  parameters  a and 
a2  by  solving  two  equations.  The  variance  of  the  estimates  for  a and  a2  are  very  small, 
even  zero,  if  N is  infinite.  In  this  way,  we  use  the  integrated  spectrum  and  the  RS  of  a 
signal  to  estimate  the  parameters  rather  than  its  spectrum. 
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The  analytical  spectrum  of  AR  noise 


(a) 

The  analytical  RS  of  AR  noise 


(b) 


(c) 


(d) 


Figure  3.7:  The  spectrum  and  the  ratio  spectrum  of  an  AR  noise  ( a = 0.6,  a2e  = 2).  (a)  The 
analytical  spectrum  of  an  AR;  (b)  The  spectrum  of  an  AR  (one  trial);  (c)  The  analytical 
ratio  spectrum;  (d)  The  ratio  spectrum(one  trial). 
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However,  to  implement  the  idea,  two  problems  have  to  be  considered.  First,  how  to 
choose  k to  obtain  the  minimum  variance  estimate?  In  other  words,  how  to  sample  the  RS. 
Secondly,  what  is  the  analytical  performance?  We  next  discuss  these  two  problems.  The 
Ix(k)  can  be  written  as 

k 2 

ix{k)  = V 

•“  1 — 2a  cos  ^z  + a2 

1=1  iv 

N_  

27T  " 1 - 2a  cos  + a2^ 

N [*2n  a2 

~ ir-  / z r —2dx  (3.38) 

27t  Jq  1 — 2a  cos  x + a2 

Using  standard  integration  tables,  we  have 


1 


b + c cos  ax 


dx  = 


ay/b 2 — c2 


arctan 


cz  sin  ax 


c + b cos  ax 


(3.39) 


then, 


and 


W) 


2 N 1 

o — arctan 

27T 1 — a2 

2N  1 

cr  — arctan 

2n  1 — a2 

a2f(a,k) 


(1  — a2)sinx 
—2a  + (1  + a2)  sin  a;  0 
(1  — a2)  sin  ^27t 
—2a  + (1  + a2)  sin  ^2 7r 


(3.40) 


Rx(k)  = 


/(Q;  k) 

f(a,N) 


= s(a , k ) 


(3.41) 


2 _ Ix{k) 

f(a,k ) 


(3.42) 


The  estimate  of  a is  the  random  variable  since  the  observed  data  Rx(k)  is  random. 


We  obtain  a by  using  Eq.  3.41.  After  obtaining  the  a,  we  have 


~2 dx{k2j 

f(a,k) 


(3.43) 


To  evaluate  the  performance  of  estimate  a,  we  need  to  derive  an  expression  for  a,  which  is 
not  straightforward  since  the  relationship  is  very  complicated.  However,  an  approximation 
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can  be  made  due  to  the  small  variance  of  Rx(k) 


ARx(k ) « sa(a,k)Aa 


(3.44) 


then, 


(3.45) 


The  variance  of  a,  var(a),  is  given  by 


var(Rx(k , a)) 


(3.46) 


In  Eq.  3.46,  var(Rx(k ))  can  be  computed  by  using  Eq.  2.50,  and  the  denominator 


can  be  computed  by  using  Eq.  3.41 . Therefore,  we  can  find  k such  that  the  variance  of  a is 
minimum.  However,  it  is  hard  to  further  simplify  the  equation,  so  numerical  computation 
is  necessary  to  find  the  solution. 

To  verify  the  analytical  result  given  by  Eq.  3.46,  a simulation  result  is  shown  in 
Fig.  3.8,  where  an  AR(1)  signal  (a2  = 2)  is  the  input  with  a = 0.2, 0.4,  0.6, 0.8  respec- 
tively. For  each  value  of  a given,  we  change  k from  1 to  N,  and  compute  the  variance 
of  the  estimate  a.  The  analytical  variance  of  a is  computed  using  Eq.  3.46,  and  the  sim- 
ulated variance  is  obtained  by  averaging  1000  trials  of  the  AR(1).  From  this  figure,  we 
can  see  that  there  is  a minimum  variance  at  an  unknown  k for  each  parameter  a.  Also,  the 
simulation  result  is  very  close  to  the  analytical  predictions. 

If  we  have  some  knowledge  on  a to  be  estimated,  we  can  achieve  the  minimum 
variance  of  the  parameter  estimate.  However,  if  we  do  not  have  any  knowledge  of  a,  we 
take  k = A/4  or  u = 7t/2  in  other  simulations.  In  the  second  simulation,  the  input  signal 
used  is  an  AR(1)  with  a2  = 1.  The  parameter  a is  changed  from  0.1  to  0.9  with  0.1 
increments.  We  first  apply  our  algorithm  to  estimate  the  parameters  a and  a2,  and  compute 
the  variances  of  these  estimates.  The  variance  is  obtained  by  averaging  1000  trials  as 


60 


(a)  (b) 


(c) 


(d) 


Figure  3.8:  The  variance  of  an  AR  noise  (analytical  and  simulated)  (a2  = 2).  (a)  a = 0 2- 
(b)  a = 0.4;  (c)  a = 0.6;  (d)  a = 0.8. 
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before.  In  addition,  the  LPC  algorithm  is  used  to  estimate  these  parameters  to  make  a 
comparison.  The  simulation  results  are  shown  in  Fig.  3.9.  From  the  figure,  we  observe 
that  the  performance  using  the  RS  is  slightly  worse  than  that  using  the  widely-used  LPC, 
although  both  are  inversely  proportional  to  the  data  N. 

It  is  well  known  that  the  AR  signal  is  a time  domain  signal,  and  the  famous  Yule- 
Walker  (YW)  or  LPC  algorithm  can  generally  be  used  to  estimate  the  parameters.  Dif- 
ferently, LPC  is  a method  the  in  time  domain,  while  the  RS  is  a method  in  the  frequency 
domain.  Obviously,  there  are  inherent  advantages  for  frequency  domain  signal  processing, 
such  as  easily  filtering  different  frequency  components.  In  some  situations,  the  model  of 
a problem  is  inaccurate  or  mismatches  exist.  An  example  is  the  case  that  the  input  con- 
sists of  an  AR  signal  plus  impulsive  noise.  We  apply  both  our  algorithm  and  LPC,  and 
the  simulation  results  show  that  both  algorithm  fail  to  produce  good  parameter  estimates. 
In  another  example  where  the  input  signal  consists  of  an  AR  signal  plus  a sinusoid  rather 
than  a pure  AR,  we  can  show  that  LPC’s  performance  is  poor  while  the  RS  produces  better 
results.  Fig.  3.10  show  the  simulation  results  when  an  input  is  an  AR  signal  plus  a sinu- 
soid. Filtering  is  used  before  using  both  algorithms  in  the  simulation.  From  this  figure,  we 
can  see  that  the  means  of  a and  a2  using  LPC  are  very  bad,  and  RS  produces  much  better 
results  than  LPC. 

In  summary,  parameter  estimation  using  the  RS  may  be  considered  as  processing  in 
frequency  domain,  and  may  be  efficient  for  the  situations  where  the  frequency  analysis  is 
necessary  or  helpful.  Also,  there  are  some  cases  such  as  j noise  where  it  is  hard  to  describe 
them  in  time  domain.  In  those  cases,  the  RS  may  produce  better  performance  than  other 


methods. 
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Figure  3.9:  The  variance  and  mean  of  the  parameter  estimate  of  an  AR  noise  signal  (RS 
and  LPC).  (a)  The  mean  of  estimate  a;  (b)  The  mean  of  estimate  erf;  (c)  The  variance  of 
estimate  a;  (d)  The  variance  of  estimate  of. 
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Figure  3.10:  The  variance  and  mean  of  the  parameter  estimate  of  an  AR  noise  signal  plus 
a sinusoid  (RS  and  YW).  (a)  The  mean  of  estimate  a;  (b)  The  mean  of  estimate  cr^;  (c)  The 
variance  of  estimate  a;  (d)  The  variance  of  estimate  cr|. 


CHAPTER  4 

ANALOG  IMPLEMENTATION  WITH  ADAPTIVE  LPF  AND  CONVERGENCE  ANALYSIS 


4. 1 Introduction 


In  this  chapter,  we  first  talk  about  the  analog  ratio  spectrum,  and  discuss  the  window 
effect  due  to  the  nonideal  LPF  used.  Then,  we  introduce  the  architecture  with  adaptive  LPF 
in  the  analog  implementation  of  the  RS.  In  addition,  for  the  nonlinear  system,  we  derive 
the  approximate  expression  of  convergence  time  and  steady  state  error,  which  are  verified 
by  the  simulations 


4,2,1  Analog  Ratio  Spectrum 

The  analog  ratio  spectrum  R(wc)  is  formed  by  computing  the  ratio  of  the  power  of 
a low-pass  filtered  signal  to  the  power  of  the  original  un-filtered  signal  for  all  filter  cutoff 
frequencies.  It  can  be  defined  as 


where  X(uj)  is  the  Fourier  transform  of  a signal  and  H(uj,uc)  is  the  transfer  function  of  a 
low-pass  filter  with  a cutoff  frequency  uc. 

Since  the  transform  is  invariant  to  a constant  scale  factor,  we  can  assume  without 
loss  of  generality  that  input  power  equals  one  and  simplify  Eq.  4.1  to 


4.2  Analog  Ratio  Spectrum  and  Window  Effect 


(4.1) 


(4.2) 


Applying  a derivative  with  respect  to  luc  to  both  sides  of  this  equation  gives 
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'OO 


\X(u>)\2B(uj,ujc)duj 


(4.3) 


o 


where  B(u,uc)  = £-JH(uj,ujc)\2. 

We  first  study  the  case  where  H(uj,  uc)  is  an  ideal  low-pass  filter  expressed  as 

\H(u},uc)\2  = U(u>  + uc)  — U(ui  — coc)  i 


(4.4) 


where  U is  a unit  step  function.  Then  Eq.  4.3  becomes 


(4.5) 


In  other  words,  the  derivative  of  the  ratio  spectrum  is  exactly  the  power  spectrum  when  an 

ideal  low-pass  filter  is  used  in  the  reconstruction. 

4.2.2  Spectrum  Smoothed  by  a Window  with  Constant  O 

In  the  real  world,  it  is  impossible  to  realize  an  ideal  low-pass  filter  as  described 
in  Eq.  4.4.  In  fact,  an  ideal  low-pass  filter  is  not  needed  in  most  applications  of  signal 
processing.  There  are  many  realizable  low-pass  filters  available.  Let  us  first  consider  the 
Butterworth  filter  (maximally  flat),  which  has  an  implicit  expression.  We  will  see  that  the 
use  of  a Butterworth  the  low-pass  filter  results  in  an  appropriate  window  function  for  many 
applications. 

We  define  H(uj,  u>c)  as  an  Nth-order  low-pass  filter  with  the  following  transfer  func- 
tion 


where  u>c  is  the  half  power  (or  3-dB)  frequency  of  the  filter.  It  can  be  easily  verified  that 
Rn^c)  is  a monotonic  nondecreasing  with  respect  to  ujc  and  is  bounded  between  zero  and 
one  when  the  Butterworth  LPF  is  used.  Then,  Eq.  4.3  becomes 


\H[u,uc)\2 


1 


x(uc)  = 


CJ- 


,2  N 
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\X  (lu)\2  B (lu  , luc)cLu 


(4.7) 


where 


LU 


2N 


B(lu,  luc)  - 


2 N 


ou 


2 N 


/ (j 

1 + 


2JV  \ 2 


2iV 


(4.8) 


We  can  show  that  B(lu,luc)  results  in  constant  Q filters  which  are  very  popular  in  speech 
signal  processing.  Eq.  4.8  can  be  rewritten  as 


B(u,  luc)  = 


2 N 


LUr 


(o?)"  + (£)")' 

It  can  be  shown  that  the  3dB  A lu  bandwidth  of  Bm(uj,  luc ) is 


A LU  Ri  2 — 
N 


(4.9) 


(4.10) 


and  that  B(lu,  luc)  reaches  its  maximum  value  at  lu  = luc  as 

B(luc1luc)  = — (4.11) 

ZUJC 

Therefore,  B(uj,luc ) is  approximately  a symmetric  function  around  luc  and  the  in- 
tegral of  B(lu,  luc)  with  respect  to  lu  is  1.  Using  a Taylor  series  expansion,  expanding  B at 
lu  = luc,  and  ignoring  the  terms  with  orders  larger  than  2,  we  can  approximate  B(lu,  luc ) as 
a Gaussian  function: 


B(uj,luc)  = B(ujc,ujc)  + —(B(lu,luc))  ■ (lu  - luc) 
+ 0.5-^—(B(lu,ujc))  ■ (lu  — luc)2  + • ■ • 


t Uc.  4 


UJr 


? + ■■■) 


1 


(CJ-LUC)2 

, — -exp  ( — ) 

\/27rcr  2tj2 


(4.12) 


where  a = 0.8^. 

In  addition,  Q = ^ = y,  a constant.  Therefore,  the  X(cjc)  can  be  viewed  as  the 
signal  spectrum  smoothed  by  a window  function  B(lu,  luc ) whose  bandwidth  and  central 
frequency  can  be  changed  with  constant  Q. 
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In  spectrum  estimation,  windows  are  often  used  to  smooth  a signal  spectrum,  where 
the  selection  of  window  type  and  size  [23]  is  dependent  on  the  application.  Many  of 
optimal  window  shapes  such  as  minimum  bias  spectral  criteria  tend  to  have  Gaussian-like 
shapes.  The  variance  of  spectrum  estimation  is  related  to  the  length  of  the  window  used. 
In  general,  the  longer  the  window  length,  the  smaller  the  variance.  Also,  windows  with 
varying  length  that  adapt  to  the  data  have  also  been  developed  such  as  the  Capon  method 

[19]. 

Constant  Q filters  in  the  frequency  domain  are  an  important  component  for  many 
applications  in  signal  processing.  In  speech  processing,  constant  Q filters  with  Q equal  to 
4 — 10  are  popular  [35],  In  addition,  constant  Q is  also  the  basic  feature  of  wavelets  that 
preserve  a constant  Q in  the  Fourier  domain  when  the  scale  is  changed  [36], 

To  gain  more  insight  to  the  window  shape  given  in  Eq.  4.9,  the  waveforms  of  the 
window  obtained  by  using  Eq.  4.9  are  shown  in  Fig.  4.1.  Fig.  4.1(a)  shows  the  case 
when  N = 10,  fc  = 1000Hz,  2000Hz,  and  4000Hz;  Fig.  4.2(a)  shows  the  case  when 
fc  — 2000Hz,  N = 2,  4 ,10.  We  can  see  that  the  bandwidth  of  B(uj,uc)  is  proportional 
to  fc  and  inversely  proportional  to  the  order  N of  the  low-pass  filter.  To  evaluate  how 
accurate  Eq.  4.12  is,  the  comparison  between  the  true  waveform  of  the  window  function 
and  its  Gaussian  approximation  is  shown  in  Fig.4.2,  where  the  solid  line  corresponds  to 
the  Gaussian  model  in  Eq.  4.12  and  the  dotted  line  corresponds  to  the  true  waveform  in 
Eq.  4.9.  We  can  see  that  the  waveform  of  B(lj)  is  very  close  to  our  Gaussian  model. 

To  see  the  window  effect  for  the  reconstructed  power  spectrum  discussed  above, 
we  conduct  two  simulations.  In  the  first  simulation,  the  Gaussian  white  noise  signal  with 
= 1 is  used  as  the  input  x(t).  The  simulation  results  are  shown  in  Fig.  4.3.  The  power 
spectrum  of  white  noise  (one  realization)  is  shown  in  Fig.  4.3(a).  Fig.  4.3(b)  shows  the 
reconstructed  power  spectrum,  which  is  obtained  by  differentiating  the  integrated  power 
spectrum  given  by  Eq.  4.2.  We  can  see  that  the  reconstructed  power  spectrum  shown  in 
Fig.  4.3(b)  is  much  smoother  than  that  shown  in  Fig.  4.3(a).  Also,  the  mean  and  variance  of 
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Figure  4.1:  The  waveforms  of  the  window  functions,  (a)  B(f)  when  TV  = 10  with  fc  = 
1000  (solid  line),  2000  (dashed  line)  and  4000  (dashdot)  respectively;  (b)  B(f ) when  fc  = 
2000  with  N — 2 (solid),  5 (dashed)  and  10  (dashdot)  respectively. 


Figure  4.2:  Comparison  between  Gaussian  model  (solid)  and  the  true  waveform  (dashdot) 
of  window  function  with  TV  = 10. 
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the  reconstructed  power  spectrum  based  on  averaging  500  trials  are  shown  in  Fig.  4.3(c)and 
(d)  respectively.  We  observe  that  the  mean  is  close  to  the  analytical  mean  1,  while  the 
variance  of  the  power  spectrum  is  inversely  proportional  to  the  frequency.  In  the  Eq.  4.12, 
the  width  of  the  window  is  proportional  to  the  frequency.  It  is  well  known  that  the  variance 
is  inversely  proportional  to  the  width  of  the  window.  As  a result,  the  variance  is  inversely 
proportional  to  the  frequency,  which  agrees  with  the  simulation  results.  Another  simulation 
result  for  the  case  that  the  input  is  AR(1)  noise  with  a2  = 1 and  a = 0.5  is  shown  in 
Fig.  4.4,  and  the  observation  similar  to  white  noise  case  has  been  obtained.  From  the  two 
simulation,  we  can  see  that  the  function  of  Gaussian  window  due  to  Butterworth  LPF  is 
to  reduce  the  variance  of  reconstructed  power  spectrum.  Also,  the  variance  is  inversely 
proportional  to  the  frequency  since  the  width  of  the  Gaussian  window  is  proportional  to 
the  frequency. 

We  conclude  that  X (u>c)  shown  in  Eq.  4.3  (the  derivative  of  the  ratio  spectrum)  is 
the  spectrum  smoothed  by  a constant  Q band-pass  filter  with  Gaussian  shape  if  an  N-th 
order  Butterworth  low  pass  filter  is  used.  This  operation  leads  to  a smaller  variance  in 
spectral  estimation  than  that  without  a window  and  is  very  close  to  the  optimum  window 
function  discussed  in  [23],  Also,  the  width  of  the  window  is  proportional  to  the  frequency. 
As  a result,  the  amount  of  reduction  of  variance  is  proportional  to  the  frequency.  In  ad- 
dition, the  ratio  spectrum  shown  in  Eq.  4.1  is  the  integral  of  X(u>c)  with  normalization, 
and  is  monotonic  nondecreasing  with  respect  to  luc.  These  nice  features  of  the  RS  such 
as  normalization,  nondecreasing  and  reduced  spectrum  variance  make  it  very  useful  in 
many  applications.  Other  features  might  be  obtained  if  H(lo,ujc ) is  in  a form  rather  than 
Butterworth. 

4.3  Analog  Implementation  with  Adaptive  LPF 

Basically,  there  are  two  ways  to  compute  the  ratio  spectrum  shown  in  Eq.  4. 1 . We 
can  either  sample  along  the  frequency  or  along  the  ratio  axis.  In  the  first  approach,  the 
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Figure  4.3:  The  power  spectrum  of  white  noise  (a2  = 1)  with  the  window  function  (But- 
terworth  LPF,  N = 10).  (a).  The  power  spectrum  (one  realization);  (b).  The  power 
spectrum  reconstructed  from  the  windowed  integrated  spectrum;  (c).  The  mean  of  recon- 
structed power  spectrum  (500  trials);  (d).  The  variance  of  reconstructed  power  spectrum 
(500  trials). 
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Figure  4.4:  The  power  spectrum  of  AR  ( a = 0.5,  a2  = 1)  with  the  window  function 
(Butterworth  LPF,  N = 10).  (a).  The  power  spectrum  (one  realization);  (b).  The  power 
spectrum  reconstructed  from  the  windowed  integrated  spectrum;  (c).  The  mean  of  recon- 
structed power  spectrum  (500  trials);  (d).  The  variance  of  reconstructed  power  spectrum 
(500  trials) 
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signal  is  applied  to  a low-pass  filter  with  cutoff  frequency  luc  specified,  then  the  output 
average  power  is  computed.  Next,  the  power  of  the  un-filtered  signal  is  computed.  Finally, 
we  get  the  ratio.  The  operation  needs  to  be  repeated  for  many  predetermined  values  of  ujc 
In  the  second  approach,  the  architecture  is  shown  in  Fig.  4.5.  S(t ) is  the  input 
signal,  Ps  is  the  power  of  input  signal,  and  H(u,  ujc ) is  the  transform  function  of  a LPF 
with  variable  cutoff  frequency.  The  ratio  R (0  < R < 1)  represents  the  power  attenuation. 
e(t)  is  an  error  signal  (e(t)  = Pd  - Pf),  which  is  used  to  adjust  the  cutoff  frequency  of  the 
LPF.  Once  R is  set,  the  cutoff  frequency  is  obtained  adaptively.  When  the  attenuated  power 
of  the  signal  ( Pd  = RPS ) is  different  from  the  power  of  the  filtered  signal  (Pf),  the  error 
signal  e(t)  is  applied  to  an  integrator  to  adjust  the  g (transconductance)  of  the  LPF  and 
thus  change  the  cut  off  frequency  uc  until  the  error  is  zero,  i.e.  Pf  = RPS  or  R = Pf/Ps 
which  is  equivalent  to  Eq.  4.1.  We  obtain  different  ljc  values  by  setting  different  R ratios. 
In  this  way,  we  obtain  R(tuc).  In  most  applications  such  as  speech  recognition  and  coding, 
only  a few  features  R(uCj)  and  tocl  are  needed,  and  the  complete  R(ujc)  is  not  necessary. 
Therefore,  this  architecture  is  more  efficient  than  the  previous  one. 

As  a special  and  simplest  case,  if  the  input  is  a sinusoid  with  frequency  Ui  and 
we  set  R = 0.5,  then  the  cutoff  frequency  u>c  of  the  LPF  converges  to  the  input  sinusoid 
frequency  adaptively,  i.e.  uc  = Ui.  This  technique  can  be  used  in  frequency  acquisition 
which  is  required  in  the  acquisition  of  most  phase  locked  loops  [37], 

4.4  Convergence  Time  and  the  Steady  State  Error  Analysis 

LMS  is  the  most  widely  used  adaptive  signal  processing  algorithm  [38].  Since  the 
MSE  is  a convex  function,  a gradient-based  adaption  scheme  can  be  used  without  fear  of 
convergence  to  a local  minimum.  However,  the  classical  LMS  or  steepest  gradient  descent 
may  not  be  appropriate  in  our  case  since  the  gradient  of  the  error  can  be  small  when  the 
error  is  large  and  the  gradient  can  be  large  when  the  error  is  small.  A simple  example  is 
that  the  input  signal  is  a sinusoid.  Instead,  the  adaptive  algorithm  is  the  integration  of  the 
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Figure  4.5:  Functional  block  diagram  for  the  RS  circuit. 


error  as  is  done  with  phase  locked  loops,  resulting  in  a robust  algorithm  that  obtains  the 
cutoff  frequency  uc  adaptively  when  a ratio  R is  given. 

The  fundament  issues  for  all  adaptive  systems  include  the  convergence  time,  steady 
state  error,  convergence  conditions,  and  the  selection  of  the  feedback  coefficient  or  step 
size.  Real  time  processing  applications  require  fast  convergence,  and  if  the  convergence 
time  is  too  long,  these  applications  become  difficult  or  impossible.  Due  to  the  nonlinear 
nature  of  the  RS  computation,  it  is  very  difficult  to  derive  the  exact  convergence  time 
and  steady  state  error.  The  phase  lock  loop  [37],  [39],  [40]  is  a nonlinear  circuit  whose 
convergence  time  has  been  well  studied  and  we  can  get  some  useful  insight  into  the  RS  by 
employing  similar  linearized  analysis  techniques.  We  next  discuss  the  factors  related  to  the 
convergence  time  and  steady  state  error. 

In  the  block  diagram  shown  in  Fig.  4.5,  we  assume  that  the  impulse  responses  of 
the  Nth-order  Butterworth  low-pass  filter  (in  our  design,  10th  order)  and  the  first-order 
LPF  for  power  computation  are  hN(t,wc)  and  hi(t)  respectively,  where  luc  is  the  cutoff 
frequency  of  Nth-order  Butterworth  LPF.  The  output  ps(t)  of  the  power  computation  is  the 
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square  of  input  signal  s(t)  convolved  with  hi(t),  then, 

Psit)  = s2(t)  ® hi (t)  (4.13) 


Thus 


pd{t)  = RPS  = Rs2(t)  ® hi(t) 


(4.14) 


where  pd(t)  is  the  attenuated  power.  Also,  the  output  of  the  Nth-order  Butterworth  filter  is 
given  by 

s0(t)  = s(t)  ®hN(t,uc)  (4.15) 


thus 


Pf(t ) = sjj(*)®/ii(t) 

= (s{t)  ® hN(t,Ldc))2  (g)  hi(t)  (4.16) 


Therefore,  the  cutoff  frequency  of  the  Nth  order  Butterworth  LPF  is  given  by  the 
solution  of  the  following  differential  equation 


= a(pd{t)  -pf(t)) 

= u(Rs2(t)®h1(t)-(s(t)®hm(t,ujc))2®h1(t))  (4.17) 

However,  from  the  above  relationship  given  by  Eq.  4. 1 7,  it  is  very  difficult  to  obtain 
the  solution  of  ujc  since  the  system  is  nonlinear.  Instead,  the  popular  piecewise  linear 

techniques  are  used  here  to  complete  the  analysis. 

4.4, 1 Continuous  Time  Domain  Analysis 


Let  R denote  the  ratio,  uj*c  the  expected  cutoff  frequency  of  the  LPF,  then  R(co*  ) = 
R,  and  H (u>,  tuc)  the  frequency  response  of  low  pass  filter,  and  Ps  the  power  of  the  input 
signal. 

Applying  a piecewise  linear  assumption  to  the  RS  curve,  we  assume  that  the  RS 
consists  of  several  pieces  and  each  piece  is  linear.  An  example  of  the  RS  is  shown  in 
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Figure  4.6:  a RS  spectrum. 


Fig.  4.6.  After  applying  this  assumption,  the  nonlinear  RS  is  converted  to  a piecewise 
linear  function. 

Assuming  that  the  cut-off  frequency  ujc{t)  is  close  to  lu*c  first,  and  that  transient 
effect  can  be  ignored,  then  it  is  reasonable  to  assume  that  the  system  is  linear.  We  can  find 
similar  assumptions  in  the  analysis  of  phase  lock  loops  [37],  [41],  With  this  assumption, 
the  error  can  be  written  as 


where  kp  is  the  slope  of  the  piece  of  curve  from  lu2  to  cu*  in  the  RS  figure.  Also,  the 


e(f)  « kp{*jjc(t)  - u* ) 


(4.18) 


(4.19) 


Then, 


(4.20) 
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where  a is  the  feedback  coefficient.  The  solution  of  the  above  linear  constant  coefficient 
differential  equation  is: 

uc{t)  = L0*c  + (wo  - u*c)e~0t  (4.21) 


where 

P = akp  (4.22) 

and  loo  = uc( 0)  is  the  initial  value  of  the  cutoff  frequency.  We  can  then  derive  the  conver- 
gence time  T needed  to  reach  u>c(T)  as 


T = 4 In 


u0  - u>: 


(4.23) 


P ujc(T)  - uj* 

If  the  input  consists  of  one  sinusoid  with  frequency  ujin  = uj*,  and  the  A'th-order  Butter- 
worth  LPF  is  used,  then  it  is  easy  to  show 

N 


P = akp  = aPs 


2 io* 


(4.24) 


Next,  we  consider  the  case  that  initial  cuc(0)  is  far  from  co*  shown  in  Fig.  4.6.  In 
the  figure,  ui  and  cu2  are  the  frequencies  where  the  curve  bends.  Let  T3  be  the  time  taken 
from  uj0  to  lui  and  /?2  the  slope;  T2  the  time  taken  from  to  oj2;  and  T\  the  time  from  ui  to 
LJ*  and  Pi  slope  in  this  piece.  Then  we  can  compute  the  convergence  time  piece  by  piece 
using  a piecewise  linear  technique. 

During  the  period  from  u0  to  u)i  (or  T3), 


e(t)  — P\{u2  — uj *)  + /32(cu(f)  — cui) 


(4.25) 


then 

r3  = -in|i  + ^(^2_^) 

Next,  during  the  period  from  u>i  to  cj2  (or  T2), 


(4.26) 


e(t)  = Pi{uji  - uj*) 


(4.27) 


then, 


1 Ll>2  — 

Pi  U>1  - UJ* 


(4.28) 
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Similarly,  the  time  taken  from  uq  to  lu*  (or  Tx)  is 


(4.29) 


Therefore,  the  total  time  T is: 

T = Tx  + T2  + T3 


(4.30) 


Similarly,  we  can  analyze  the  case  where  cuc  is  in  any  region  far  from  u*.  Using 
piecewise  linear  techniques,  the  total  time  is  the  sum  of  the  time  taken  in  each  piece.  It 
should  be  pointed  out  that  T3  < T2  < X)  since  the  error  during  T2  is  larger  than  the 
error  during  Tx  if  the  frequency  gap  and  the  slope  in  each  piece  are  same.  As  a result,  the 
dominant  component  of  the  total  convergence  time  occurs  when  ujc(t)  is  close  to  u>*  since 
the  error  is  smaller  during  this  period.  Therefore,  Eq.  4.23  can  be  used  as  the  estimation 
of  approximate  convergence  time  [42]  with  good  accuracy  in  many  situations.  The  system 
converges  as  long  as  a > 0.  In  practice,  the  selection  of  a is  a tradeoff  between  slow 
convergence  and  steady  state  error  (misadjustment). 

After  deriving  the  convergence  time,  we  next  analyze  the  steady  state  error  (misad- 
justment) of  the  cutoff  frequency  of  Nth-order  Butterworth  low  pass  filter.  The  steady  state 
error  is  caused  by  the  fact  that  the  error  e(t)  is  not  zero  even  if  the  mean  of  e(t)  is  zero. 
The  error  can  be  written  as, 


where  M ( t ) stands  for  the  unexpected  signal,  which  can  be  noise  or  other  oscillations  due 
to  the  nonideal  operations.  Then, 


e(t)  = kp(ujc(t ) - lo*c)  + M(t) 


(4.31) 


— a;c(t)  fa  - ae(t ) + aM(t ) 


= - akp(uc(t ) - cj*)  + aM(t ) 
= k(uc(t)  — u*)  + aM(t ) 


(4.32) 


where  k = akp.  Thus,  the  steady  state  response  (after  t — ► oo)  is, 


ue(t)  = v*  + kM(t)  ® ha{t) 


(4.33) 
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where 

M<)  = 

= e~kt  (4.34) 

Therefore,  the  steady  state  error  is 

Wc(t)  ~ uj*c  = kM(t)  ® ha(t)  (4.35) 

To  gain  more  insight  into  steady  state  error  given  in  the  above  equation,  we  now 
consider  the  case  where  the  unexpected  signal  M(t)  is  caused  by  the  power  computation. 
In  this  situation,  the  oscillation  occurs  due  to  the  first-order  low  pass  filter.  The  output 
of  the  power  computation  consists  of  DC  and  AC  components.  The  DC  term  is  what  we 
expect,  while  the  AC  term  is  unexpected  due  to  the  nonzero  cutoff  frequency  of  the  first- 
order  LPF.  If  the  input  can  be  viewed  as  the  combination  of  sinusoids,  and  we  assume 
that  there  is  one  dominant  sinusoid  with  amplitude  A and  frequency  u>0  after  the  square 
operation,  then 

M{t)  = s2(t)  <g>  hi(t)  — c 

~ A0  sin(tuo£  + (po ) (4.36) 

where  c is  the  DC  term,  and  hi  (t)  is  the  impulse  response  of  first-order  LPF  given  by 

hi(t)  = ujcle~“clt  (4.37) 

From  the  above  equation,  we  can  model  M(t)  approximately  as  a sinusoid  with  amplitude 
A0  and  frequency  cu0,  while  other  components  are  filtered  and  can  be  ignored.  Then, 

Ao  = A--  1 (4.38) 

A + e>2 

Consequently,  Eq.  4.33  becomes, 

wc(f)  = ljc  + kAo  sin(a>of  + <f>) 


y^o  + k2 


(4.39) 
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Therefore,  the  steady  state  error  is, 

ujc(t)  -uj*  = kA0  sin(w0t  + 4>)  ..  } = (4.40) 

V <^o  T-  k2 

From  the  above  derivation,  we  can  see  that  the  steady  state  error  is  proportional 
to  the  feedback  coefficient  a,  the  amplitude  of  the  dominant  sinusoid  A0  and  the  cutoff 
frequency  ucl  (if  ujc1  < uQ)  of  LPC  for  power  computation,  but  is  inversely  proportional 

to  the  frequency  of  the  dominant  sinusoid  lu0  ( k -C  cj0). 

4.4.2  Discrete  Time  Domain  Analysis 

In  the  digital  implementation,  an  iterative  algorithm  is  used  to  find  the  uc  given  the 
ratio  R.  The  idea  and  procedure  in  the  analysis  for  digital  implementation  are  similar  to 
those  for  analog  implementation.  Here,  we  briefly  introduce  the  difference. 

The  iterative  equation  can  be  written  as 

ujn+i  = u)n  - ae(t)  (4.41) 

where  e(t)  is  the  error.  In  our  analysis,  we  assume  that  the  ratio  spectrum  R(u>c)  is  piece- 
wise  linear  with  slope  rate  Sk  in  the  kth  linear  period  [ujk,ujk+1},  1 < k < N,  where  N is 
the  number  of  piecewise  linear  segments.  Assuming  the  lo0  is  close  to  the  final  value,  then 
the  above  equation  can  be  reduced  to: 

Un+i  ~ un-  2 cce(n) 

= un  - 2a(Pd(n)  - Pf(n))  (4.42) 

where  a is  the  step  size.  Taking  the  mean  of  both  sides,  we  have: 

E[u)n+i]  = E[u)n]  — 2a{E[Pd(n)}  — E[Pf(n))} 

= E[un]  - 2 aSk(u*  - E[un))  (4.43) 

where  Sk  («  PsN/u*)  is  the  slope  which  is  close  to  the  final  value.  Also, 


E[un+1]  - u,c*  = (1  - 2aSk)[E[un]  - u;c*] 


(4.44) 
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where  uj*  is  the  optimal  value.  Defining  an  error  variable  un  as  the  difference  between  the 
expected  value  of  the  adaptive  filter  cutoff  frequency  and  the  optimal,  i.e. 


Un  — Ulc 


(4.45) 


then, 


un+i  = (1  - 2 aSk)un  = (1  - 2aSk)nu0  (4.46) 

So,  if  |1  — 2aSk\  < 1,  i.e.  |a|  < T-,  un  converges  to  zero,  equivalently,  ujn  will  converge 
to  uj*.  when 


i i 1 

M < -Q- 

the  adaptive  filter  converges.  The  convergence  time  is 

1 


(4.47) 


Toe 


2 aSk 


(4.48) 


In  the  similar  ways  to  the  continuous-time  analysis,  we  can  also  examine  other  cases. 
4,4.3  Simulations 


To  evaluate  the  convergence  analysis  efficiently,  we  simulated  the  nonlinear  RS 
system  in  Matlab  since  we  can  isolate  the  error  sources  more  easily  than  actual  circuits.  In 
the  first  simulation,  we  want  to  verify  the  conjecture  that  the  dominant  time  is  taken  when 
the  error  is  small. 

In  the  simulation,  the  input  signal  consists  of  two  sinusoids  with  the  same  magni- 
tude (Ai  = A2  = 0.5)  and  different  frequencies  (fx  = 3000Hz  and  f2  = 4000Hz).  The 
simulation  results  are  shown  in  Fig.  4.7.  Fig.  4.7(a)  shows  the  ratio  spectrum  with  a tenth- 
order  LPF,  while  Fig.  4.7(b)  indicates  the  convergence  curve.  The  initial  cutoff  frequency 
is  2000Hz,  the  final  cutoff  frequency  is  expected  at  4000Hz  since  we  set  the  ratio  to  0.75. 
We  can  see  the  cutoff  frequency  converges  to  the  final  value.  The  time  taken  from  2000Hz 
to  3000Hz  is  much  smaller  than  that  from  3000Hz  to  4000Hz,  since  the  error  in  the  piece 
from  3000Hz  to  4000Hz  is  much  smaller  than  that  in  the  piece  from  2000Hz  to  3000Hz. 
As  a result,  the  simulation  result  agrees  with  our  analysis. 


Figure  4.7:  Convergence  curve  of  two  sinusoids  input  with  a lOth-order  Butterworth  LPF 
(a)  The  ratio  spectrum;  (b)  The  convergence  curve  ( R = 0.75). 
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Next,  we  examine  the  relation  between  convergence  time  T and  the  final  frequency. 
In  this  simulation,  the  input  consists  of  two  sinusoids  with  the  same  power  Psl  = Ps2  = 0.5 
and  /2  = 4/cHz,  j\  is  a variable.  R = 0.25  so  that  the  fc  (or  u>c)  is  expected  to  converge 
to  f ! . Also,  a = 0.125  * 10-3,  and  the  initial  cutoff  frequency  of  the  LPF  is  f0  = 0.75 fx. 
The  input  frequency  fi  is  varied  from  1.5/cHz  to  3/cHz.  The  convergence  time  is  defined 
as  the  time  from  initial  to  95%  of  the  final  value.  The  relation  between  convergence  time 
T and  the  input  sinusoid  frequency  is  shown  in  Fig.  4.8.  The  solid  line  represents  the 
linearized  analytical  value  given  by  Eq.  4.23  and  the  stars  represent  the  actual  simulated 
value.  Fig.  4.9  shows  the  relationship  between  convergence  time  with  the  input  power 
where  /i  = 3/cHz,  f0  = 0.75/i,  and  power  Psl  varies  from  0.5  to  1.  We  can  see  the 
simulation  results  are  close  to  the  analytical  results  of  Eq.  4.23,  with  about  10%  deviation 
due  to  the  linearized  approximations. 

The  relationship  between  convergence  time  and  step  size  (or  feedback  coefficient) 
is  shown  in  Fig.4.10.  Fig.  4.10(a)  show  the  convergence  curves  corresponding  to  ten  dif- 
ferent a from  0.1  x 10  3 to  1.1  x 10-3.  The  relationship  between  convergence  time  and 
different  a is  shown  in  Fig.  4.10(b),  where  ‘x’  and  ‘* ’corresponds  to  the  analytical  result 
given  by  Eq.  4.23  and  simulation  results  respectively. 

The  relationship  between  convergence  time  and  initial  cutoff  frequency  /c(0)  is 
shown  in  Fig.  4.11.  Fig.  4.1 1(a)  show  the  convergence  curves  for  ten  different  /c(0)  from 
2200Hz  to  3500Hz.  The  relationship  between  convergence  time  and  different  /c(0)  is 
shown  in  Figure4.1 1(b),  where  ‘x’  and  ‘*’  correspond  to  the  analytical  and  simulation 
results  respectively. 

From  the  above  simulations,  we  observe  that  the  convergence  time  in  Eq.  4.23  is  a 
good  approximation  with  about  10%  deviation  for  the  nonlinear  system.  A more  accurate 
result  can  be  obtained  if  the  initial  frequency  is  close  to  its  final  value. 

The  following  simulations  are  designed  to  verify  the  derivation  for  steady  state 
error  given  by  Eq.  4.33  and  Eq.  4.39.  In  these  simulations,  the  input  signal  consists  of  two 
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Figure  4.8:  The  relationship  between  the  convergence  time  and  the  frequency  for  two 
sinusoids  example. 


T vs.  Ps 


Figure  4.9:  The  relationship  between  the  convergence  time  and  power  for  two  sinusoids. 


84 


1.2 

x ICf3 


(b) 

Figure  4.10:  The  relationship  between  the  convergence  time  and  a for  two  sinusoids,  (a) 
convergence  curve  with  different  a (decreasing  from  left  to  right);  (b)  relationship  between 
the  convergence  time  and  a. 
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Figure  4.11:  The  relationship  between  convergence  time  and  initial  cutoff  frequency,  (a) 
convergence  curve  with  different  initial  cutoff  frequency;  (b)  relation  between  convergence 
time  and  initial  cutoff  frequency. 
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sinusoids  with  different  frequencies  fy  and  f2,  amplitudes  Ay  and  A2  respectively,  and  the 
sampling  frequency  is  20/cHz.  Also,  we  take  the  mean  square  root  error  (MSRE)  as  the 
steady  state  error,  and  let  flc  be  the  cutoff  frequency  of  the  first-order  low  pass  filter  in  the 
power  computation.  First,  we  look  at  the  relationship  between  MSRE  and  the  feedback 
coefficient  a.  In  the  simulation,  fy  = 1500Hz,  f2  = 3000Hz,  Ay  = A2  = 1,  flc  = 300Hz, 
and  R = 0.25.  The  final  value  of  the  cutoff  frequency  (/c)  of  the  tenth-order  filter  is 
expected  to  be  fy  = 1500.  a is  changed  from  1 x 10~4  to  10  x 10~4.  The  simulation  result 
is  shown  in  Fig.  4.12(a).  From  the  figure,  we  can  see  that  the  simulation  result  is  close 
to  the  analytical  one  given  by  Eq.  4.33  with  around  (2%)  deviation,  and  observe  that  the 
MSRE  is  proportional  to  the  feedback  coefficient  a. 

In  the  second  simulation,  we  examine  the  relationship  between  MSRE  and  input 
power.  In  the  simulation,  a=lx  1CT4  is  fixed,  and  Ay  = A2  is  varied  from  \/l  to  a/To. 
The  simulation  result  is  shown  in  Fig.  4.12(b).  We  observe  that  simulated  result  is  close 
to  the  analytical  one  with  around  (2%)  deviation.  Clearly,  the  MSRE  is  proportional  to  the 
input  power. 

In  the  third  simulation,  we  want  to  see  the  relationship  between  the  MSRE  and 
the  dominant  input  frequency  /0.  We  set  a = lx  10“4,  and  A — 1 fixed,  and  let  fy 
vary  from  1500Hz  to  2500Hz.  As  a result,  the  dominant  frequency  f0  = f2-  fy  varies 
from  1500Hz  to  500Hz.  As  shown  in  Eq.  4.39,  the  MSRE  increases  with  /o  reduced.  The 
result  is  shown  in  Fig.  4.12(c).  From  the  figure,  we  can  see  that  the  simulated  result  agrees 
with  the  analytical  result  with  around  (1%)  deviation.  The  steady  state  error  is  inversely 
proportional  to  the  frequency  /0  of  the  dominant  sinusoid. 

In  the  last  simulation,  we  investigate  the  relationship  between  MSRE  and  fcl.  fcl 
is  set  to  vary  from  300Hz  to  1100Hz.  When  /cl  increases,  the  output  of  power  compu- 
tation for  dominant  sinusoid  increases,  then  the  MSRE  increases.  The  result  is  shown  in 
Fig.  4.12(d).  We  can  see  the  simulated  result  is  consistent  with  the  analytical  one  with 
around  (3%)  deviation. 
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As  a summary  of  steady  state  error  simulations,  we  observe  that  simulation  agrees 
with  the  analytical  derivation  given  in  Eq.  4.33  and  Eq.  4.39.  The  steady  state  error  is 
proportional  to  the  feedback  coefficient  a,  cutoff  frequency  /cl  of  the  power  computation, 
the  input  signal  power,  and  is  inversely  proportional  to  the  frequency  of  dominant  sinusoid. 


88 


Figure  4.12:  Steady  state  error  versus  several  variables  for  two  sinusoids  example,  (a) 
feedback  coefficient  a ; (b)  input  power;  (c)  frequency  /i  of  one  sinusoid  variable  (/2  = 
3000Hz);  (d)  cutoff  frequency  /cl  of  first-order  LPF  for  power  computation. 


CHAPTER  5 

ANALOG  CIRCUIT  DESIGN  OF  THE  RS 
5.1  Introduction 

In  this  chapter,  we  describe  the  circuit  architecture  for  the  analog  implementation 
of  the  RS,  and  the  circuit  design  of  each  building  block.  We  discuss  the  high-order  filter 
design  and  design  issues  such  as  cascade  sequence.  Also,  we  discuss  the  circuit  for  the 
short-time  power  computation  of  the  signal.  We  describe  the  design  and  simulation  of 
analog  to  digital  converter  with  successive  approximate  architecture.  Finally,  a prototype 
chip  was  fabricated  and  tested,  and  we  report  the  results  of  the  chip  test.  In  the  chip  test, 
both  assumed  signal  and  real  speech  signals  ware  applied.  The  experiments  show  that  the 
chip  can  successfully  extract  the  important  features  from  speech  and  other  signals. 

5.2  Circuit  Architecture 

In  this  section,  we  introduce  the  circuit  architecture  used  to  implement  the  system. 
Fig  5.1  shows  the  circuit  block  diagram  for  the  ratio  spectrum  computation  implemen- 
tation. The  power  computation  part  is  realized  by  applying  the  square  of  the  signal  to  a 
low-pass  filter  which  consists  of  a capacitor  Cx  and  an  OTA1 . The  ratio  R is  finished  by  the 
gain  and  bias  shift  block  which  is  also  used  as  an  adjustment  to  the  overall  gain  and  bias 
shift.  OTA2  and  C2  are  used  as  another  LPF  for  the  power  computation.  The  output  from 
the  integrator  consisted  of  OTA3  and  Cz  is  fedback  to  the  tenth-order  LPF  to  change  the 
bias  current  of  the  OTA  used  in  the  tenth-order  LPF  such  that  the  transconductance  g of  the 
OTA  in  the  LPF  is  changed  since  g is  proportional  to  the  square  root  of  the  bias  current  in 
the  saturation  region,  and  is  proportional  to  the  bias  current  in  the  subthreshold  region.  It  is 
designed  that  the  cutoff  frequency  luc  of  a tenth-order  LPF  is  proportional  to  the  transcon- 
ductance g of  each  OTA  of  the  LPF.  If  Pd  ^ RPS  (VA  ^ Vg),  that  is,  the  desired  power  of 
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the  signal  is  different  from  the  power  of  the  filtered  signal  (Pf),  the  error  signal  is  applied 
to  the  integrator  ( Vq)  to  adjust  the  g of  the  tenth-order  LPF,  and  thus  change  the  cut-off 
frequency  uc  until  the  error  is  zero.  At  convergence,  Pf  — RPd- 

Obviously,  there  are  many  ways  to  implement  each  building  block.  It  should  be 
mentioned  that  one  consideration  of  our  designs  is  that  the  circuit  for  each  building  block 
is  able  to  be  operated  in  both  the  subthreshold  region  and  the  above-threshold  region  for 
the  flexible  applications  [43],  which  influences  our  selection  and  design  of  the  circuits. 
Basically,  circuits  operated  in  the  subthreshold  region  result  in  low  power  consumption  but 
small  linear  input  range.  In  contrast,  the  circuits  operated  in  above-threshold  region  con- 
sume more  power  but  provide  large  linear  range.  Further  difference  is  worth  mentioning 
here.  The  tunable  transconductance  g in  subthreshold  and  saturation  is  proportional  to  the 
bias  current  and  square  root  of  bias  current  respectively.  As  a result,  the  OTA  operated 
in  the  subthreshold  can  provide  a much  wider  range  of  g than  that  in  the  above-threshold 
region.  In  our  design,  the  cutoff  frequency  of  the  LPF  is  desired  to  change  from  100Hz 
to  5kHz  due  to  speech  applications,  a much  larger  range  than  what  an  OTA  operated  in 
the  above-threshold  region  can  provide.  To  achieve  this  wide  range,  the  OTA  is  set  in 
subthreshold. 

5.3  Circuit  Design  and  Design  Considerations 
5.3. 1 Design  of  the  Tenth  Order  Butterworth  Low  Pass  Filter 

The  transconductance-C  filter  is  used  in  the  filter  design  due  to  its  low  power  and 
simple  tuning  capability.  There  are  several  architectures  available  for  general  analog  filter 
design  [44],  Basically,  there  are  three  general  methods.  They  are:  1.  Cascade  connection 
of  second-order  sections;  2.  Multiple-loop  feedback  circuits;  3.  Simulation  of  passive  LC 
ladder  networks. 

• In  method  1 , taking  advantage  of  the  useful  biquadratic  sections,  we  write  the  high- 
order  function  as  the  product  of  biquadratic  factors  that  we  realize  accordingly.  Next, 
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Figure  5.1:  The  block  diagram  of  the  RS  circuit. 

we  cascade  these  sections  by  connecting  the  output  of  each  section  to  the  input  of 
the  following  one.  This  method  has  the  advantages  of  simplicity  in  designing  and 
aligning  the  filter,  provided  that  the  output  of  each  section  is  of  very  low  impedance 
(practically  zero). 

• In  method  2,  multiple  feedback  and,  in  some  cases,  multiple  feed-forward  is  ap- 
plied in  a cascade  connection  of  biquadratic  sections.  This  coupling  leads  to  a better 
sensitivity  performance  of  the  overall  circuit  compared  to  the  corresponding  circuit 
obtained  by  Method  1 . However,  passive  inductors  are  exceedingly  difficult  to  im- 
plement in  VLSI. 

• In  method  3,  simulation  of  passive  resistance-terminated  lossless  ladder  networks 
can  be  achieved  by  simulation.  The  ladder  simulation  method  is  attractive,  because 
it  leads  to  active  circuits  of  lower  sensitivities  than  the  other  two  methods. 
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In  our  design,  we  use  the  cascade  connection  of  the  second-order  architecture  which 
has  a simple  and  easy  tuning  procedure  advantage,  as  shown  in  Fig.  5.2. 

The  frequency  response  of  an  Nth-order  Butterworth  low-pass  filter  can  be  written 
as 

= ittW  (5>) 

'C Jc' 

where  N = 10,  and  uic  is  the  cut-off  frequency.  Then 

\H(u)\2  = H(a)H(-8) \s=ju 
1 

■ 1 + (-1)N&N 
1 

- 1 + (_1)JV52JV  (5-2) 

where  5 = s/ujc  is  the  normalized  frequency. 

The  poles  are 


5fc  = -sin(^7r)+jcos(^17r),  k = 1,  • • • ,2N.  (5.3) 


We  take  the  poles  Sk  in  the  left  S-plane,  then 


H(S)  = 


i 

(S-S!)(S-52)-(5-5N) 


, A;  = 1, 2,  • • • , iV. 


(5.4) 


The  impulse  response  can  be  written  as 

N 

h(t)  = J2aie~Si“ct  (5.5) 

2=1 

where  the  a*  is  a complex  coefficient. 

Replacing  Sk  in  Eq.  5.4  with  Eq.  5.3,  the  transform  function  of  the  Nth-order  But- 
terworth LPF  becomes 


H(S) 


1 1 1 

S2  + 0.3128S1  + 1 S2  + 0.9085  + 1 52  + 1.4145  + 1 
1 1 


52  + 1.7825  + 1 52  + 1.97545  + 1 
T5  x T4  x T3  x x Xi 


(5.6) 
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Figure  5.2:  Tenth-order  Butterworth  LPF  with  variable  cutoff  frequency. 
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Figure  5.3:  The  block  diagram  of  2nd  order  LPF. 

Therefore,  it  can  be  implemented  by  five  second-order  filters  cascaded  together.  A circuit 
for  the  second-order  filter  is  shown  in  Fig.  5.3.  The  transform  function  of  the  circuit  is 


H2(s) 


1 

TlT2S2  + r2s  + 1 


(5.7) 


where  tx  = cl/gl  and  r2  - c2/g2. 

Five  second-order  filters  are  cascaded  to  implement  the  tenth-order  Butterworth 
LPF.  Therefore,  the  transform  function  of  the  tenth-order  Butterworth  LPF  is 


H10(s) 


1 1 1 

T\T2s2  + t2s  + 1 T3T4S2  + t4s  + 1 t5t6s2  + t6s  + 1 

1 1 

r7r8s2  + r8s  + 1 r9r10s2  + Ti0s  + 1 


(5.8) 


where  A = a/gu  1 < i < 10.  The  parameter  selections  on  capacitance  C and  transcon- 
ductance gi  are  flexible  as  long  as  Eq.  5.8  realizes  Eq.  5.6.  For  simplicity  in  our  design,  we 
set  the  transconductances  of  all  ten  OTAs  same.  That  is,  gx  = g2  = • • • = gw  = g.  For  the 
first  second-order  term,  we  have 

1 


(5.9) 
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and 


0.3128 


72  = 


(5.10) 


Then 


0.3128  uc 


(5.11) 


and 


C2  = 0.3128— 

Ldc 

Similarly,  for  other  second-order  terms,  we  have: 


(5.12) 


(5.13) 


or 


(5.14) 


i 


where  ku  i = 1 : 5 is  a coefficient  and  k(  1 : 5)  = [0.3128, 0.908, 1.414, 1.782, 1.9754]. 

For  example,  if  we  select  gi  = gm  = lOuA/v,  and  fc  = lk/2nHz,  then  Cj  = 


C7  = 17.82uF,  C8  = 5.612uF,  C9  = 19.752uF,  Ci0  = 5.062uF. 

From  Eq.  5.14,  the  cutoff  frequency  ujc  of  the  LPF  is  proportional  to  the  transcon- 
ductance g of  the  OTA  used  in  the  tenth-order  Butterworth  LPF.  In  our  design,  the  transcon- 
ductance g is  adjusted  by  changing  the  bias  current  of  the  OTA  in  the  LPF,  which  is 
achieved  by  adjusting  the  bias  voltage  Vb  of  the  OTA  shown  in  Fig.  5.6.  In  the  RS  cir- 
cuit system,  the  Vb  is  from  the  error  between  the  expected  power  and  the  actual  power. 

One  issue  on  the  high-order  filter  design  using  a cascade  is  the  cascade  sequence. 
Cascade  sequence  greatly  affects  the  dynamic  range  of  the  corresponding  second-order 
filter  and  consequently  that  of  the  whole  filter.  The  frequency  responses  of  five  second- 
order  filters  are  shown  in  Fig.  5.4.  To  obtain  the  optimum  sequence  in  terms  of  high 
dynamical  range,  there  exist  5!  different  sequence  possibilities,  which  must  be  examined. 
It  can  be  shown  that  optimum  placement  requires  that  the  7j  stage  is  put  in  the  first  stage 


3.128uF  and  C2  = 31.969uF,  C3  = 9.08uF,  C4  = 11.013uF,  C5  = 14.142uF,  C6  = 7.07uF, 
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Cascade  Sequence 


Figure  5.4:  Frequency  response  of  2nd  order  filters. 


and  that  the  T5  is  put  as  the  last  stage.  To  achieve  the  maximum  undistorted  range,  one 
of  the  optimum  cascade  sequence  is  Tu  T2,  T3,  T4,  and  T5.  We  can  examine  the  sequence 
by  assuming  a sinusoid  input  with  amplitude  1 and  the  undistorted  range  1 from  Tx  to  T5 
stage.  Then,  the  output  amplitude  of  Tx  is  less  than  or  equal  to  1;  and  the  output  ofTj  is  the 
input  of  stage  T2.  The  output  amplitude  of  T2  is  also  less  than  or  equal  to  1;  and  so  on.  In 
this  way,  we  can  see  that  the  maximum  undistorted  range  of  signal  is  1.  On  the  contrary,  if 
the  T5  is  placed  in  first  stage,  the  maximum  output  amplitude  of  this  stage  is  larger  than  1, 
consequently  causing  distortion  in  the  following  stages.  As  a result,  the  overall  input  range 
without  distortion  is  less  than  1.  It  should  be  mentioned  that  the  arrangement  is  optimal 
in  terms  of  maximum  undistorted  range  for  stable  frequency  response  for  sinusoidal  input. 
However,  the  more  useful  distortion  analysis  for  arbitrary  signals  is  a more  difficult  issue 
[45],  and  left  to  future  research. 
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5.3.2  Power  Computation 

The  signal  power  computation  block  is  achieved  by  applying  the  square  of  a signal 
to  a first-order  low-pass  filter.  The  first-order  LPF  in  power  computation  consists  of  an 
OTA  and  a capacitor  shown  in  Fig.  5.1  in  which  an  OTA1  and  Cx  forms  a first-order  LPF 
while  an  OTA2  and  C2  forms  another  first-order  LPF.  Using  first-order  LPF,  we  keep  DC 
component  and  neglect  other  AC  components,  thus  obatin  the  power  of  a signal.  Another 
reason  to  use  the  first-order  LPF  for  power  computation  is  that  it  provides  an  exponential 
window  for  a signal,  which  is  widely  used  in  dealing  with  nonstationary  signals,  such  as 
speech  signal.  It  should  be  pointed  out  that  the  time  constant  of  the  LPF  has  an  effect  on 
the  power  computation.  Next,  we  discuss  the  selection  of  the  time  constant  and  its  effect. 

Assume  the  input  is  a sinusoid  signal  given  by  s{t)  = Acos(uji),  and  the  impulse 
response  of  the  first-order  low-pass  filter  (LPF)  is  given  by 

hit)  = (5.15) 


where  r is  the  time  constant  of  the  LPF.  Then,  the  output  of  LPF  for  power  computation  is 


s2{l)h{t  — l)dl 


A2 

2(1  + 4lu2t2) 


(cos  2 ut  + 2 lut  sin2uit) 


A2  A2 

2 6 " 2(1  + 4u;2r2)e  " 


(5.16) 


In  the  above  equation,  the  last  two  terms  are  the  transient  responses,  they  become 
very  small  after  time  r.  The  first  two  terms  indicate  the  steady  state  responses.  The  first 
term  is  the  power  expected,  and  the  second  term  is  the  oscillating  signal  cased  by  the  finite 
time  constant.  If  t > t,  the  last  two  terms  are  very  small  and  can  be  neglected,  then 


P(t) 


A2  A2 

2 +2(l  + 4 uj2t2) 
A2  A2 

2 2(1  + 4u;2r2) 


(cos  2 tut  + 2 ujt  sin  2 cut) 


(cos  2 cot  + (/>) 


(5.17) 
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Figure  5.5:  Output  waveform  of  LPF  for  power  computation,  (a)  cjc  = 0.005;  (b)  uc  = 
0.010;  (c)  wc  = 0.02. 
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From  Eq.  5. 1 6 , we  can  see  that  the  r should  be  small  for  a short  transient  response, 
but  the  resulting  oscillating  amplitude  is  large.  On  the  contrary,  if  r is  large,  then  the 
oscillating  amplitude  is  small.  Therefore,  the  tradeoff  has  to  be  made  in  selecting  the  first- 
order  LPF  time  constant  r. 

To  see  the  effect  of  different  time  constants  r,  a simulation  result  is  shown  in 
Fig.  5.5.  In  the  simulation,  the  input  is  a sinusoid  with  amplitude  1 and  frequency  1kHz. 
The  sampling  frequency  is  20/cHz.  The  LPF  is  a first-order  Butterworth  filter  for  power 
computation.  Three  different  cutoff  frequencies  (uc  = 0.005,0.01,0.02)  are  used  in  the 
simulation  to  see  the  tradeoff  between  oscillation  and  transient  time.  The  longest  transient 
and  the  smallest  oscillation  amplitude  correspond  to  uc  = 0.005  shown  in  Fig.  5.5(a). 
Longer  transient  and  smaller  oscillation  amplitude  occur  when  r increases,  which  is  con- 
sistent with  our  analysis.  Therefore,  in  selection  of  the  time  constant,  a tradeoff  is  made 

between  short  transient  time  and  large  oscillation  (steady  state  error). 

5.3.3  Error  Integrator 

The  error  integrator  consists  of  an  OTA  and  a capacitor  (OTA3  and  C3  in  Fig.  5.1). 
OTA3  is  used  to  convert  the  difference  of  two  input  voltage  signals  to  currents,  which  is 
integrated  by  a capacitor  C:i  and  converted  to  a voltage  output  Vc.  Vc  is  then  applied  to 
the  Gate  Vt  of  the  OTA  shown  in  Fig.  5.6  used  in  tenth-order  LPF,  where  Vf,o  is  used  for 
initial  setting.  Consequently,  the  current  bias  can  be  changed,  which  equivalently  changes 
the  cutoff  frequency  of  a tenth-order  LPF. 

A wide-range  OTA  [46]  shown  in  Fig.  5.7  is  used  since  the  range  of  error  voltage 
may  be  larger  than  that  provided  by  a simple  OTA.  Also,  we  can  tune  the  transconductance 
of  OTA3  to  obtain  a variable  feedback  coefficient  and  thus  to  adjust  the  convergence  time. 
Therefore,  OTA3  and  C3  are  used  to  integrate  the  error  which  is  then  fedback  to  change 
the  current  bias  of  the  tenth-order  LPF,  which  consequently  changes  the  transconductance 
g and  cut-off  frequency  u>c  of  the  tenth-order  LPF. 


Vdd 


Figure  5.6:  Simple  OTA. 


Vcc 


Wide-range  OTA 

Figure  5.7:  Wide  range  OTA. 
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Vdd 


Figure  5.8:  Gilbert  multiplier. 

5.3.4  Analog  Multiplier  Circuit  and  Buffer  Circuit 

As  discussed  previously,  the  signal  power  computation  block  is  achieved  by  apply- 
ing the  square  of  a signal  to  a first-order  low-pass  filter.  There  exist  several  methods  for 
square  computation  [47]-  [50],  The  square  in  our  system  requires  that  the  input  is  voltage 
signal  and  that  the  output  is  current  signal.  Also,  this  square  calculation  can  be  operated 
in  both  the  above-threshold  and  the  subthreshold  regions.  In  order  to  meet  the  above  re- 
quirements , the  Gilbert  multiplier  shown  in  Fig.  5.8  is  selected  to  achieve  the  square.  In 
the  multiplier,  the  square  is  realized  by  setting  the  two  inputs  equal  (in  term  of  AC)  except 
different  DC  voltages. 
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It  can  be  shown  [46]  that,  in  the  subthreshold  region,  we  have 

r r . u k(V,  V2)  , k(V3  - VA) 

Lout  — h tanh tanh  — - 


(5.18) 


The  output  current  is  approximately  proportional  to  the  multiplication  of  two  input  volt- 
ages when  the  input  voltage  is  small.  In  addition,  in  the  above-threshold  region,  it  can  be 
shown  that  the  output  is  also  approximately  proportional  to  the  multiplication  of  two  input 
voltages. 

To  separate  the  output  of  the  square  circuit  and  the  input  of  the  first-order  LPF,  a 
Buffer  is  used.  The  Buffer  circuit  is  shown  in  Fig.  5.9,  where  the  current  input  is  from  the 
output  of  the  multiplier  and  output  current  is  applied  to  the  first-order  LPF.  In  this  circuit, 
lout  = hn  is  designed,  and  the  voltage  at  output  node  can  be  different  from  the  voltage  at 
the  input  node.  As  a result,  the  voltages  at  the  A and  B nodes  in  Fig.  5.1  can  be  changed 

more  widely  without  affecting  the  multiplier,  where  output  voltage  swing  is  limited  [46]. 
5.3.5  Gain  Attenuation  or  Ratio  and  DC  Bias  Shift  Circuits 


The  circuit  shown  in  Fig.  5.10  has  two  functions.  It  can  be  shown  that 


K 


out 


VC2  + ^-( Vin  — Vel) 

9 m2 


(5.19) 
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As  a result,  the  ratio  can  be  achieved  by  changing  the  current  bias  of  one  OTA  or  two  OTAs 
in  the  circuit.  In  addition,  the  output  voltage  DC  bias  can  be  different  from  that  of  the  input 
from  Vcl  to  VC2,  such  that  it  can  function  as  the  DC  bias  shift.  In  the  multiplier  block,  two 
different  DC  bias  voltages  are  needed,  so  we  can  achieve  two  functions  with  this  circuit. 


5.4  ADC  Circuit  Design 

The  architectures  for  realizing  the  analog  to  digital  converter  (ADC)  can  be  roughly 
divided  into  three  categories.  One  category  is  the  A - S or  oversampling  ADC.  The 
oversampling  ADC  relaxes  the  requirements  placed  on  the  analog  circuitry  at  the  expense 
of  more  complicated  digital  circuitry.  The  high  accuracy  is  achieved  at  the  expense  of 
speed.  It  is  very  popular  in  high  resolution  and  medium  to  low  speed  applications.  Another 
category  is  high  speed  and  medium  to  low  resolution  ADCs  such  as  flash  ADCs. 

Successive  approximation  converters  are  one  of  the  most  popular  approaches  for 
realizing  AD  converters  due  to  their  reasonably  fast  conversion  time,  yet  moderate  circuit 
complexity  and  low  power.  For  our  system,  successive  approximation  converter  with  8 bit 
resolution  is  designed  and  simulated. 
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Figure  5.11:  Diagram  of  SAR  ADC. 

5.4.1  Architecture 

The  architecture  we  use  is  shown  in  Fig.  5.1 1,  which  can  be  found  in  p492  of  [41], 
In  total,  9 capacitors  are  used  in  the  architecture,  where  C8  = 2 C7  = • • • = 27C2  = 28C'1. 
The  operation  includes  the  sample  mode,  hold  mode  and  bit  cycling.  In  sample  mode,  all 
capacitors  are  charged  to  Vin  while  the  comparator  is  being  reset  to  its  threshold  voltage. 
The  capacitor  array  performs  the  sample-and-hold  operation.  In  hold  mode,  the  comparator 
is  taken  out  of  reset  by  opening  Si,  and  then  all  the  capacitors  are  switched  to  ground.  This 
causes  Vx  (negative  input  of  the  comparator),  which  is  originally  zero,  to  change  to  -Vin, 
thereby  holding  the  input  signal  Vin  on  the  capacitor  array.  Finally,  S2  is  switched  so  that 
Vref  can  be  applied  to  the  capacitor  array  during  bit  cycling.  In  bit  cycling  mode,  the  largest 
capacitor  (C8)  is  switched  to  Vref.  Vx  now  goes  to  (~VX  + Vref/2).  lfVx  is  negative,  then 
Vin  is  greater  than  Vref/ 2,  and  the  MSB  capacitor  is  left  connected  to  Vref.  Also,  bit8  ( b8 ) 
is  taken  to  be  1.  Otherwise,  the  MSB  capacitor  is  reconnected  to  ground  and  b8  is  taken  to 
be  0.  This  process  is  repeated  N times,  with  a smaller  capacitor  being  switched  each  time, 

until  the  conversion  is  finished. 

5.4.2  Circuit  Description 

The  comparator  used  in  the  ADC  is  shown  in  Fig.  5. 12,  which  can  be  found  on  page 
693  of  [51].  This  comparator  consists  of  three  stages.  The  function  of  the  first  stage  is  to 
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Figure  5.12:  Circuit  of  comparator. 


of  COMP 


Figure  5.13:  Basic  control  block. 
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amplify  the  signal  with  gain  1000.  The  second  stage  is  a positive  feedback  circuit  used  to 
decide  signal  sign.  The  last  stage  is  the  output  stage. 

Fig.  5.13  shows  one  of  the  eight  control  circuits  connected  to  each  switch  Sj  (1  < 
i < 8),  while  the  timing  of  the  ADC  is  shown  in  Fig.  5.14.  The  control  circuit  and  tim- 
ing are  designed  so  that  the  whole  ADC  operates  in  sample  mode,  hold  and  bit  cycling 
repeatedly.  In  Fig.  5.13,  s,  is  connected  to  one  end  of  each  capacitor,  and  is  designed  to 
be  connected  to  Ground  and  Sn8  as  required.  The  St  is  connected  to  Sng  during  sample 
mode,  and  is  connected  to  Ground  during  hold  mode.  In  bit  cycling  mode,  the  output  of 
comparator  is  connected  to  a D flip-flop  to  make  the  switch  s*  connected  to  Sn8  or  Vref  if 
the  output  of  comparator  is  1,  otherwise  to  make  the  switch  S';  connected  to  ground  if  out- 
put of  the  comparator  is  0.  Pdi  is  a pulse  signal  whose  rising  edge  activates  the  D flip-flop. 
Also,  the  functions  of  all  nodes  are  shown  in  Fig.  5.13.  The  timing  is  shown  in  Fig.  5.14, 
where  vpc,  vppl  and  vp22  are  used  to  make  the  switches  Si  and  S2  operated  in  sample 
and  hold  modes  properly,  vpl  and  vp2  are  connected  to  PI  and  P2  in  each  control  block 
and  make  each  operate  in  sample  and  hold  modes  correctly.  vd8  to  vdl  are  designed  to 
determine  Si  either  connected  to  Ground  or  Vref.  The  «th  bit  is  taken  from  the  Q of  the  SR 
shown  in  Fig.  5.13. 

5.4,3  Simulation  of  the  ADC 

The  simulations  for  the  performance  evaluation  of  the  designed  ADC  are  shown  in 
Fig.  5.15.  The  input  is  a sinusoid  with  magnitude  2V  and  frequency  1.5kHz.  The  sampling 
frequency  is  30kHz.  We  can  see  that  result  is  almost  ideal.  This  shows  that  our  circuit  and 
timing  work  very  well  with  ( MSE  < 1%). 

5.5  Simulation  and  Chip  Test  of  RS  Chip 
5.5.1  Simulation  Results 

The  frequency  response  of  the  tenth-order  LPF  is  shown  in  Fig.  5.16.  The  solid  line 
and  stars  correspond  to  the  ideal  result  and  circuit  HSPICE  simulation  result  respectively. 
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Figure  5.14:  Timing  diagram  of  the  ADC. 
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We  can  see  the  simulation  result  is  close  to  the  ideal  result.  The  small  deviation  is  caused 
by  the  nonlinear  features  of  the  circuit  especially  for  the  high-order  filter. 

Fig.  5.17  show  the  square  circuit  simulation  result  by  using  the  circuit  shown  in 
Fig.  5.8.  The  bias  voltages  of  V3  and  V\  are  set  to  3.7V  and  2.5V  respectively,  while  the 
amplitudes  of  V3  and  Vi  are  the  same.  Therefore,  the  current  at  node  Vout  is  approximately 
equal  to  the  square  of  V\ . The  result  shows  it  is  close  to  the  square  operation  especially  for 
small  inputs. 

Fig.  5.18  shows  the  relationship  between  the  ratio  R and  the  cutoff  frequency  ujc 
when  the  input  signal  is  a sinusoid  with  1000Hz  and  magnitude  20  mV.  The  solid  line 
and  stars  represent  the  analytic  results  and  HSPICE  simulation  results,  respectively.  We 
observe  that  the  results  are  very  close  with  some  small  deviations  due  to  the  nonlinear 
circuit  behavior  and  parameter  sensitivity  of  the  high-order  LPF. 
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8 bits  ADC  Circuit  result 


Figure  5.15:  Simulation  result  for  the  ADC. 


An  HSPICE  simulation  of  the  transient  waveform  of  the  voltage  signal  representing 
loc  is  shown  in  Fig.  5.19.  The  input  signal  consists  of  three  consecutive  sinusoids.  The  first 
part,  from  time  0 to  200ms,  is  a sinusoid  with  / = 1800Hz;  in  the  second  part  from  200ms 
to  400ms,  the  input  switches  to  a sinusoid  with  / = 1200Hz;  and  the  final  part  is  a sinusoid 
with  / = 1800Hz.  We  observe  that  circuit  works  as  expected.  The  convergence  time  from 
1.2&Hz  to  1.8A;Hz  is  35ms,  longer  than  that  (25ms)  from  1.8A;Hz  to  1.2£;Hz,  which  agrees 

with  the  analysis  discussed  in  the  last  chapter. 

5.5.2  Chip  Test  Results 

A prototype  RS  chip  was  fabricated  using  the  0.5um  AMI  process  technology  and 
the  power  supply  voltage  used  is  5V.  The  layout  of  the  chip  is  shown  in  Fig.  5.20.  As 
mentioned  before,  the  chip  is  designed  to  be  able  to  operate  in  both  subthreshold  and  above- 
threshold regions  for  flexible  applications,  which  can  be  simply  achieved  by  changing  the 
initial  bias  current.  In  the  test,  all  circuit  blocks  are  set  to  operate  in  the  subthreshold 
region.  And  the  initial  bias  voltage  Ho  of  OTA  used  in  the  tenth-order  LPF  is  set  to  be 
about  0.8V,  while  another  bias  voltage  H is  obtained  adaptively  from  the  error  integrator. 
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Figure  5.16:  HSPICE  transform  function  of  the  tenth-order  Butterworth  LPF. 


Figure  5.17:  The  simulation  result  for  the  square  circuit. 
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R VS  frequency 


Figure  5.18:  HSPICE  simulation  results  of  Ratio  vs.  cutoff  frequency. 


TRANSIENT  WAVEFORM 


Figure  5.19:  HSPICE  voltage  feedback  waveform  for  three  consecutive  sinusoids. 
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Figure  5.20:  The  Layout  of  RS  chip. 


Also,  Vb  is  measured  and  converted  into  the  cutoff  frequency  of  the  tenth-order  low  pass 
filter.  In  order  to  evaluate  the  chip  performance,  in  the  experiments  we  compute  the  ratio 
spectrum  for  both  sinusoids  and  real  speech  signal.  Also,  we  will  measure  the  convergence 
time.  The  signal  frequency  used  in  the  testing  is  100Hz  to  5kHz  since  this  is  the  range  of 
useful  frequencies  in  speech  signals.  In  the  chip  testing,  the  system  offset  is  cancelled  in 
calibration  by  tuning  the  voltages  Vbl  and  Vb2  shown  in  Fig.  5.1. 

First,  we  measure  the  frequency  response  of  the  tenth-order  filter.  The  cutoff  fre- 
quency of  the  tenth-order  LPF  is  set  at  1kHz  by  tuning  the  bias  current  of  the  OTA  used  in 
the  LPF.  The  input  is  a sinusoid  with  amplitude  20mV  and  with  variable  frequency.  The 
frequency  response  of  a tenth-order  LPF  is  shown  in  Fig.  5.21.  The  solid  line,  stars  and  ’o’ 
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Transform  function  of  tenth-order  Butterworth  filter 


Figure  5.21 : Frequency  response  of  tenth-order  Butterworth  LPF. 


correspond  to  the  analytic  result,  circuit  simulation  result  and  chip  test  result  respectively. 
We  can  see  the  chip  result  is  close  to  the  analytical  result.  The  small  deviation  (around 
1%)  is  caused  by  the  inaccurate  capacitances  and  the  nonlinearity  of  the  circuits  in  the 
high-order  filter. 

Next,  we  apply  a sinusoid  to  the  chip  input,  and  set  the  ratio  from  0.1  through 
0.9,  then  measure  the  voltage  Vc  which  corresponds  the  cutoff  frequency.  Fig.  5.22  shows 
the  ratio  spectrum  when  the  input  signal  is  a sinusoid  with  1000Hz  and  amplitude  20  mV. 
The  solid  line,  stars  and  ’o’  the  represent  the  analytic  results,  cadence  simulation  and  chip 
results  respectively.  We  observe  that  the  results  are  very  close  with  some  small  deviations 
(around  1%).  The  deviations  become  large  when  the  ratio  is  small.  In  this  case,  the  signal  is 
weak,  equivalently  the  SNR  is  small.  As  a result,  the  circuit  is  more  sensitive  to  nonlinear 
circuit  behavior  and  parameter  deviations. 
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Ratio  spectrum  for  one  sinusoid 


Figure  5.22:  Ratio  spectrum  for  a sinusoid  input. 


The  following  experiment  is  to  evaluate  the  convergence  time.  It  is  very  important 
for  real  time  speech  processing  since  the  typical  duration  of  a speech  phoneme  is  only 
around  80ms.  The  dynamic  cutoff  frequency  waveform  luc  (converted  from  is  shown  in 
Fig.  5.23,  where  the  input  signal  consists  of  three  consecutive  sinusoids.  The  first  part,  from 
time  0 to  100ms,  is  a sinusoid  with  / = 1000Hz;  for  the  second  part  from  100ms  to  200ms, 
the  input  switches  to  a sinusoid  with  / = 2000Hz;  and  the  third  part  is  a sinusoid  with 
/ = 1000Hz.  We  observe  that  circuit  works  as  expected.  In  the  experiment,  the  measured 
data  is  stored  in  oscilloscope  and  transformed  into  PC,  and  there  is  quantization  error.  As 
a result,  we  can  see  some  oscillation  like  noise  in  the  figure.  The  true  plot  in  oscilloscope 
is  much  smoother  than  what  we  see.  The  convergence  time  from  IkHz  to  2KHz  is  20ms, 
shorter  than  that  (35ms)  from  2kHz  to  1/cHz,  which  agrees  with  the  analysis  discussed 
in  the  last  chapter.  Similarly,  Fig.  5.24  shows  the  dynamic  cutoff  frequency  waveform 
when  the  input  consists  of  three  sinusoids  with  frequency  from  3k-2k-3k  alternatively. 
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Figure  5.23:  Dynamical  voltage  waveform  for  three  consecutive  sinusoids  input  with  fre- 
quency 1kHz  and  2KHz  alternatively. 

We  can  see  the  time  from  2k-3k  is  45ms  while  the  convergence  time  is  around  35ms. 
The  longer  the  convergence  time,  the  higher  the  frequency.  These  results  agree  with  the 
analysis  in  chapter4  . In  addition,  we  can  see  it  may  be  used  in  speech  processing  in  real- 
time since  the  typical  convergence  time  can  be  less  than  50ms,  which  is  smaller  than  the 
duration  of  typical  speech  phoneme  signal.  The  small  oscillation  in  the  figure  is  related 
to  the  feedback  coefficient  and  the  cutoff  frequency  of  first-order  LPF  used  in  the  power 
computation,  which  is  discussed  in  the  last  chapter.  The  smaller  level  of  oscillation,  the 
smaller  the  feedback  coefficient  and  the  smaller  the  cutoff  frequency  of  the  first-order  LPF. 
But  the  convergence  time  will  be  longer. 

The  final  experiment  is  to  compute  the  ratio  spectrum  for  real  speech  signals  from 
the  TIMIT  data  base,  which  is  widely  used  to  evaluate  speech  recognition  algorithms.  The 
experiment  results  for  ratio  spectrums  using  Phoneme  EH  and  IY  are  shown  in  Fig.  5.25 
and  Fig.  5.26  respectively,  where  Figure  (a),  (b)  and  (c)  correspond  to  the  time  waveform, 
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Figure  5.24:  Dynamical  voltage  waveform  for  three  consecutive  sinusoids  input  with  fre- 
quency 2kHz  and  3KHz  alternatively. 

power  spectrum  and  ratio  spectrum  respectively.  In  Fig.  5.25(c),  the  RSI  is  obtained  by  us- 
ing the  RS  architecture  shown  in  Fig.  4.5,  while  the  RS2  is  obtained  by  using  the  definition 
given  in  Eq.  4. 1 in  which  H (cu)  is  the  transform  function  of  the  tenth-order  Butterworth 
LPF.  Obviously,  the  RSI  is  the  time  domain  method  while  the  RS2  is  the  frequency  do- 
main method.  They  are  different  since  a first-order  LPF  is  used  to  compute  the  power  in 
RSI  rather  than  ideal  rectangular  function.  But,  the  difference  is  very  small  as  we  see 
in  the  figure,  since  RSI  is  obtained  by  averaging  over  the  entire  data  and  the  RS2  is  also 
obtained  by  using  the  whole  data.  It  should  be  pointed  out  that  the  RS2  plot  is  included  in 
the  figure  just  for  the  purpose  of  comparison  since  the  time  domain  method  rather  than  the 
frequency  domain  method  is  used  in  the  circuit  implementation  of  the  RS.  To  evaluate  the 
performance  of  the  RS  chip,  it  is  enough  to  compare  the  RS  1 and  the  test.  We  observe  that 
deviations  between  RSI  and  the  test  are  within  7%  due  to  circuit  nonlinearity.  However, 
the  main  features  for  those  signals,  such  as  three  formants,  can  be  seen  clearly.  Therefore, 
the  experiment  shows  that  we  can  recognize  two  phonemes  by  using  their  ratio  spectrums 
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from  the  chip.  In  addition,  we  can  see  the  cutoff  frequency  of  the  tenth-order  LPF  is  ad- 
justed over  a wide  range  (50  times)  from  100Hz  through  5/cHz.  It  should  be  pointed  out 
that  the  deviation  can  be  smaller  if  the  range  of  changeable  cutoff  frequency  is  smaller, 
since  it  is  hard  to  get  better  linearity  for  such  wide  range  DC  bias  change.  The  power  con- 
sumption of  whole  chip  depends  on  the  signal  input  frequency.  As  a example,  the  power  at 
300Hz  is  around  2 uA  x 5V  = 10 uW . 
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Speech  signal:  EH 
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Figure  5.25:  Speech  phoneme  EH.  (a)  Waveform,  (b)  Power  spectrum,  (c)  Ratio  Spectrum. 
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Speech  signal:  IY 
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Figure  5.26:  Speech  phoneme  IY.  (a)  Waveform,  (b)  Power  spectrum,  (c)  Ratio  Spectrum. 


CHAPTER  6 

CONCLUSIONS  AND  FUTURE  WORK 
6.1  Conclusions 

We  have  studied  the  power  spectrum,  the  integrated  spectrum  and  the  ratio  spec- 
trum, and  discussed  the  their  properties.  We  also  presented  the  applications  using  the  RS 
in  detection  and  parameter  estimation.  In  addition,  we  designed,  implemented  and  tested 
an  improved  analog  IC  for  the  RS.  Some  progress  and  contributions  can  be  summarized  as 
below: 

6.1.1  Signal  Processing 

We  show  an  important  result  that  the  integrated  spectrum  and  the  ratio  spectrum 
are  consistent  estimators  unlike  the  widely-used  power  spectrum.  This  property  makes 
the  integrated  spectrum  and  the  ratio  spectrum  more  useful  for  some  applications  than  the 
conventional  power  spectrum.  We  derive  such  properties  as  the  mean  and  the  variance,  and 
the  distributions.  In  addition,  we  derive  the  mean  and  variance  of  the  ratio  spectrum  and 
obtain  the  approximate  closed-form  solution,  which  is  considered  difficult  in  [4],  [23].  All 
derivations  are  verified  by  computer  simulations. 

We  model  the  window  function  as  a Gaussian  function  when  a Butterworth  low- 
pass  filter  is  used.  Also  we  show  the  function  of  window  is  to  reduce  the  variance  of  signal 
spectrum,  The  window  with  Gaussian  shape  is  close  to  the  optimum  window  widely  used 
to  reduce  the  variance  of  the  signal  spectrum. 

The  approximate  quantitative  relationship  between  uniform  sampling  on  the  fre- 
quency axis  and  uniform  sampling  on  the  ratio  axis  is  derived,  and  verified  by  simulations. 
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A detection  algorithm  for  unknown  signals  using  the  RS  has  been  proposed.  In 
the  algorithm,  we  sample  the  ratio  axis  to  enhance  the  signal  to  noise  ratio  such  that  the 
detection  performance  can  be  improved.  We  also  analyze  the  detection  performance.  Fur- 
thermore, we  show  that  the  performance  of  the  algorithm  is  better  than  the  conventional 
GLRT  with  fewer  assumptions  made.  In  addition,  we  propose  a detection  algorithm  using 
wavelet  thresholding,  which  is  a popular  algorithm  in  the  wavelet  area,  and  analyze  its 
performance  and  restrictions. 

We  propose  an  algorithm  for  parameter  estimate  in  the  RS  domain.  Due  to  the 
consistent  property  of  the  RS,  we  show  that  better  performance  can  be  achieved  compared 

with  conventional  methods. 

6.1.2  Analog  IC  Design 

An  improved  analog  IC  for  the  analog  implementation  of  the  RS  has  been  designed 
and  implemented.  We  conduct  the  chip  test  for  both  assumed  signals  and  real  speech 
signals.  In  the  chip  test,  1%  deviation  was  achieved  for  a pure  sinusoid  signal  input,  and 
less  than  7%  deviation  was  achieved  for  real  speech  signals.  The  experiments  show  that 
the  main  features  can  be  recognized  by  using  the  RS  obtained  from  the  chip.  We  believe 
that  the  analog  ratio  spectrum  offer  an  appealing  alternative  method  for  conventional  filter 
banks  for  signal  processing  applications.  The  circuits  differs  from  previous  work  in  the 
following  ways: 

In  the  implementation  of  the  RS,  we  analyze  the  convergence  performance,  which 
is  important  for  all  adaptive  system.  Using  piecewise  linear  techniques,  we  derive  equa- 
tions for  computing  the  convergence  time.  Also,  we  derive  the  approximate  expression  for 
steady  state  error.  These  derivations  are  verified  by  simulations  and  chip  measurements. 

We  design  and  implement  a tenth-order  Butterworth  gm-c  LPF,  and  the  cutoff  fre- 
quency of  the  LPF  can  be  widely  adjusted  from  100Hz  to  5KHz.  We  describe  the  circuits 
and  discuss  their  design  considerations,  such  as  the  cascade  sequence  issue  for  the  filter. 
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In  the  power  computation  circuit,  we  use  a square  and  LPF  to  form  the  power 
computation  rather  than  an  absolute  value  block.  We  also  discuss  the  selection  of  the 
cutoff  frequency  of  the  LPF.  One  of  the  advantages  using  this  design  is  that  the  short-time 
window  needed  for  nonstationary  signals  can  be  achieved  with  this  design. 

In  the  attenuation  block  to  obtain  the  ratio,  two  OTAs  are  used  to  achieve  improved 
linearity  rather  than  previous  work. 

We  have  designed  an  analog  to  digital  converter  with  the  successive  approximation 
architecture.  This  architecture  is  popular  for  many  years  since  it  has  reasonable  quality, 
intermediate  complexity  and  low  power. 

6.2  Future  Work 

Though  a lot  of  work  has  already  been  completed  and  described  in  this  dissertation, 
there  are  still  many  questions  that  remain  to  be  answered.  We  describe  them  in  both  the 

signal  processing  and  the  analog  IC  domain. 

6.2.1  Signal  Processing 

The  applications  in  other  areas  such  as  detection,  spectrum  estimate,  parameter 
estimation  and  identification  need  to  be  investigated  further.  We  believe  that  more  appli- 
cations exist  since  the  RS  is  a consistent  estimate,  and  there  are  advantages  of  sampling 

along  the  ratio  axis. 

6.2.2  Analog  IC 

The  chip  performance  can  be  further  improved  by  more  design  enhancements.  First 
of  all,  the  AGC  (automatical  gain  control)  should  be  included  in  the  system  in  order  to 
improve  the  performance  especially  when  the  signal  is  weak  and  the  ratio  is  small.  The 
circuit  operates  in  a nonlinear  region  if  the  signal  magnitude  is  too  large.  To  avoid  the 
nonlinearity,  the  gain  should  be  small.  As  a result,  the  unexpected  components  such  as 
remaining  offset  and  noise  may  be  dominant  if  the  signal  is  small  and  the  gain  is  small. 
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Therefore,  AGC  is  needed  if  we  want  to  obtain  better  performance.  We  believe  that  the 
gain  may  be  controlled  by  the  variance  of  speech  signal  rather  than  its  peak. 

Finally,  an  efficient  OTA  with  wide  linear  range  and  low  power  consumption  should 
be  used  in  the  circuits.  It  may  be  difficult  to  design  these  OTAs  if  the  OTA  has  to  operate 
in  the  subthreshold  region,  which  are  widely  considered  as  an  inherent  disadvantage  in  the 
use  of  subthreshold. 
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